Preface


Coding theory began in the late 1940’s with the work of Golay, Hamming 
and Shannon. Although it has its origins in an engineering problem, the 
subject has developed by using more and more sophisticated mathematical 
techniques. It is our goal to present the theory of error-correcting codes in a 
simple, easily understandable manner, and yet also to cover all the important 
aspects of the subject. Thus the reader will find both the simpler families of
codes (for example, Hamming, BCH, cyclic and Reed-Muller codes) discussed
in some detail, together with encoding and decoding methods, as well
as more advanced topics such as quadratic residue, Golay, Goppa, alternant, 
Kerdock, Preparata, and self-dual codes and association schemes. 

Our treatment of bounds on the size of a code is similarly thorough. We
discuss both the simpler results (the sphere-packing, Plotkin, Elias and
Garshamov bounds) as well as the very powerful linear programming method
and the McEliece-Rodemich-Rumsey-Welch bound, the best asymptotic 
result known. An appendix gives tables of bounds and of the best codes 
presently known of length up to 512. 

Having two authors has helped to keep things simple: by the time we both 
understand a chapter, it is usually transparent. Therefore this book can be 
used both by the beginner and by the expert, as an introductory textbook and 
as a reference book, and both by the engineer and the mathematician. Of 
course this has not resulted in a thin book, and so we suggest the following 
menus: 

An elementary first course on coding theory for mathematicians: Ch. 1, Ch.
2 (§6 up to Theorem 22), Ch. 3, Ch. 4 (§§1-5), Ch. 5 (to Problem 5), Ch. 7 (not
§§7, 8), Ch. 8 (§§1-3), Ch. 9 (§§1, 4), Ch. 12 (§8), Ch. 13 (§§1-3), Ch. 14
(§§1-3).

A second course for mathematicians: Ch. 2 (§§1-6, 8), Ch. 4 (§§6, 7 and
part of 8), Ch. 5 (to Problem 6, and §§3, 4, 5, 7), Ch. 6 (§§1-3, 10, omitting the
proof of Theorem 33), Ch. 8 (§§5, 6), Ch. 9 (§§2, 3, 5), Ch. 10 (§§1-5, 11), Ch. 11,
Ch. 13 (§§4, 5, 9), Ch. 16 (§§1-6), Ch. 17 (§7, up to Theorem 35), Ch. 19 (§§1-3).

An elementary first course on coding theory for engineers: Ch. 1, Ch. 3,
Ch. 4 (§§1-5), Ch. 5 (to Problem 5), Ch. 7 (not §7), Ch. 9 (§§1, 4, 6), Ch. 10
(§§1, 2, 5, 6, 7, 10), Ch. 13 (§§1-3, 6, 7), Ch. 14 (§§1, 2, 4).

A second course for engineers: Ch. 2 (§§1-6), Ch. 8 (§§1-3, 5, 6), Ch. 9
(§§2, 3, 5), Ch. 10 (§11), Ch. 12 (§§1-3, 8, 9), Ch. 16 (§§1, 2, 4, 6, 9), Ch. 17 (§7,
up to Theorem 35).

There is then a lot of rich food left for an advanced course: the rest of 
Chapters 2, 6, 11 and 14, followed by Chapters 15, 18, 19, 20 and 21 — a feast! 


The following are the principal codes discussed: 


Alternant, Ch. 12;
BCH, Ch. 3, §§1, 3; Ch. 7, §6; Ch. 8, §5; Ch. 9; Ch. 21, §8;
Chien-Choy generalized BCH, Ch. 12, §7;
Concatenated, Ch. 10, §11; Ch. 18, §§5, 8;
Conference matrix, Ch. 2, §4;
Cyclic, Ch. 7, Ch. 8;
Delsarte-Goethals, Ch. 15, §5;
Difference-set cyclic, Ch. 13, §8;
Double circulant and quasi-cyclic, Ch. 16, §§6-8;
Euclidean and projective geometry, Ch. 13, §8;
Goethals generalized Preparata, Ch. 15, §7;
Golay (binary), Ch. 2, §6; Ch. 16, §2; Ch. 20;
Golay (ternary), Ch. 16, §2; Ch. 20;
Goppa, Ch. 12, §§3-5;
Hadamard, Ch. 2, §3;
Hamming, Ch. 1, §7, Ch. 7, §3 and Problem 8;
Irreducible or minimal cyclic, Ch. 8, §§3, 4;
Justesen, Ch. 10, §11;
Kerdock, Ch. 2, §8; Ch. 15, §5;
Maximal distance separable, Ch. 11;
Nordstrom-Robinson, Ch. 2, §8; Ch. 15, §§5, 6;
Pless symmetry, Ch. 16, §8;
Preparata, Ch. 2, §8; Ch. 15, §6; Ch. 18, §7.3;
Product, Ch. 18, §§2-6;
Quadratic residue, Ch. 16;
Redundant residue, Ch. 10, §9;
Reed-Muller, Ch. 1, §9; Chs. 13-15;
Reed-Solomon, Ch. 10;
Self-dual, Ch. 19;
Single-error-correcting nonlinear, Ch. 2, §7; Ch. 18, §7.3;
Srivastava, Ch. 12, §6.


Encoding methods are given for: 


Linear codes, Ch. 1, §2;
Cyclic codes, Ch. 7, §8;
Reed-Solomon codes, Ch. 10, §7;
Reed-Muller codes, Ch. 13, §§6, 7; Ch. 14, §4.


Decoding methods are given for: 


Linear codes, Ch. 1, §§3, 4;
Hamming codes, Ch. 1, §7;
BCH codes, Ch. 3, §3; Ch. 9, §6; Ch. 12, §9;
Reed-Solomon codes, Ch. 10, §10;
Alternant (including BCH, Goppa, Srivastava and Chien-Choy generalized
BCH codes) Ch. 12, §9;
Quadratic residue codes, Ch. 16, §9;
Cyclic codes, Ch. 16, §9,


while other decoding methods are mentioned in the notes to Ch. 16. 


When reading the book, keep in mind this piece of advice, which should be 
given in every preface: if you get stuck on a section, skip it, but keep reading! 
Don't hesitate to skip the proof of a theorem: we often do. Starred sections 
are difficult or dull, and can be omitted on the first (or even second) reading. 

The book ends with an extensive bibliography. Because coding theory 
overlaps with so many other subjects (computers, digital systems, group 
theory, number theory, the design of experiments, etc.) relevant papers may 
be found almost anywhere in the scientific literature. Unfortunately this 
means that the usual indexing and reviewing journals are not always helpful. 
We have therefore felt an obligation to give a fairly comprehensive
bibliography. The notes at the ends of the chapters give sources for the
theorems, problems and tables, as well as small bibliographies for some of the 
topics covered (or not covered) in the chapter. 

Only block codes for correcting random errors are discussed; we say little 
about codes for correcting other kinds of errors (bursts or transpositions) or 
about variable length codes, convolutional codes or source codes (see the 
Notes to Ch. 1). Furthermore we have often considered only binary codes,
which makes the theory a lot simpler. Most writers take the opposite point of 
view: they think in binary but publish their results over arbitrary fields. 

There are a few topics which were included in the original plan for the 
book but have been reluctantly omitted for reasons of space: 

(i) Gray codes and snake-in-the-box codes - see Adelson et al. [5, 6],
Buchner [210], Cavior [253], Chien et al. [290], Cohn [299], Danzer and Klee
[328], Davies [335], Douglas [382, 383], Even [413], Flores [432], Gardner
[468], Gilbert [481], Guy [571], Harper [605], Klee [764-767], Mecklenberg et
al. [951], Mills [956], Preparata and Nievergelt [1083], Singleton [1215], Tang
and Liu [1307], Vasil'ev [1367], Wyner [1440] and Yuen [1448, 1449].

(ii) Comma-free codes - see Ball and Cummings [60, 61], Baumert and
Cantor [85], Crick et al. [316], Eastman [399], Golomb [523, pp. 118-122],
Golomb et al. [528], Hall [587, pp. 11-12], Jiggs [692], Miyakawa and Moriya
[967], Niho [992] and Redinbo and Walcott [1102]. See also the remarks on
codes for synchronizing in the Notes to Ch. 1.

(iii) Codes with unequal error protection - see Gore and Kilgus [549],
Kilgus and Gore [761] and Mandelbaum [901].

(iv) Coding for channels with feedback - see Berlekamp [124], Horstein
[664] and Schalkwijk et al. [1153-1155].

(v) Codes for the Gaussian channel - see Biglieri et al. [148-151], Blake
[155, 156, 158], Blake and Mullin [162], Chadwick et al. [256, 257], Gallager
[464], Ingemarsson [683], Landau [791], Ottoson [1017], Shannon [1191],
Slepian [1221-1223] and Zetterberg [1456].

(vi) The complexity of decoding - see Bajoga and Walbesser [59], Chaitin
[257a-258a], Gelfand et al. [471], Groth [564], Justesen [706], Kolmogorov
[774a], Marguinaud [916], Martin-Löf [917a], Pinsker [1046a], Sarwate [1145]
and Savage [1149-1152a].

(vii) The connections between coding theory and the packing of equal
spheres in n-dimensional Euclidean space - see Leech [803-805], [807], Leech
and Sloane [808-810] and Sloane [1226].

The following books and monographs on coding theory are our predecessors:
Berlekamp [113, 116], Blake and Mullin [162], Cameron and Van Lint
[234], Golomb [522], Lin [834], Van Lint [848], Massey [922a], Peterson 
[1036a], Peterson and Weldon [1040], Solomon [1251] and Sloane [1227a]; 
while the following collections contain some of the papers in the bibliography: 
Berlekamp [126], Blake [157], the special issues [377a, 678, 679], Hartnett 
[620], Mann [909] and Slepian [1224]. See also the bibliography [1022]. 

We owe a considerable debt to several friends who read the first draft very 
carefully, made numerous corrections and improvements, and frequently 
saved us from dreadful blunders. In particular we should like to thank I.F. 
Blake, P. Delsarte, J.-M. Goethals, R.L. Graham, J.H. van Lint, G. Longo, 
C.L. Mallows, J. McKay, V. Pless, H.O. Pollak, L.D. Rudolph, D.W. Sarwate,
many other colleagues at Bell Labs, and especially A.M. Odlyzko for
their help. Not all of their suggestions have been followed, however, and the
authors are fully responsible for the remaining errors. (This conventional 
remark is to be taken seriously.) We should also like to thank all the typists at 
Bell Labs who have helped with the book at various times, our secretary 
Peggy van Ness who has helped in countless ways, and above all Marion 
Messersmith who has typed and retyped most of the chapters. Sam Lomonaco 
has very kindly helped us check the galley proofs. 

We should like to thank many friends who have pointed out errors and
misprints. The corrections have either been made in the text or are listed below. 

A Russian edition was published in 1979 by Svyaz (Moscow), and we are 
extremely grateful to L. A. Bassalygo, I. T. Grushko and V. A. Zinov’ev for 
producing a very careful translation. They supplied us with an extensive list of 
corrections. They also point out (in footnotes to the Russian edition) a number of 
places we did not cite the earliest source for a theorem. We have corrected the 
most glaring omissions, but future historians of coding theory should also 
consult the Russian edition. 

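Problem 17, page 75. Shmuel Schreiber has pointed out that not all ways of
choosing the matrices A, B, C, D work. One choice which does work is

\[
A = \begin{bmatrix} 0&0&0&1 \\ 1&0&0&0 \\ 0&0&1&0 \\ 0&1&0&0 \end{bmatrix}, \quad
B = \begin{bmatrix} 0&0&1&0 \\ 0&1&0&0 \\ 0&0&0&1 \\ 1&0&0&0 \end{bmatrix}, \quad
C = \begin{bmatrix} 0&1&0&0 \\ 0&0&1&0 \\ 1&0&0&0 \\ 0&0&0&1 \end{bmatrix}, \quad
D = \begin{bmatrix} 1&0&0&0 \\ 0&0&0&1 \\ 0&1&0&0 \\ 0&0&1&0 \end{bmatrix}.
\]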


Page 36, Notes to §2. Add after Wu [1435, 1436]: K. Sh. Zigangirov, Some 
sequential decoding procedures, Problems of Information Transmission, 2 (4) 
(1966) 1-10; and K. Sh. Zigangirov, Sequential Decoding Procedures (Svyaz, 
Moscow, 1974). 

Page 72, Research problem 2.4. It is now known that A(10, 4) = 40, 72 ≤
A(11, 4) ≤ 79, and 144 ≤ A(12, 4) ≤ 158. See M. R. Best, Binary codes with a
minimum distance of four, IEEE Trans. Info. Theory, Vol. IT-26 (6) (November
1980), 738-743.

Page 123, Research problem 4.1. Self-contained proofs have been given by O.
Moreno, On primitive elements of Trace 1 in GF(2^m), Discrete Math., to appear,
and L. R. Vermani, Primitive elements with nonzero trace, preprint.

Chapter 6, pp. 156 and 180. Theorem 33 was proved independently by Zinov'ev
and Leont'ev [1472]. 

Page 166, Research problem 6.1 has been settled in the affirmative by I. I. 
Dumer, A remark about codes and tactical configurations, Math. Notes, to 
appear; and by C. Roos, Some results on t-constant codes, IEEE Trans. Info. 
Theory, to appear. 

Page 175. Research problem 6.3 has also been solved by Dumer and Roos [op. 
cit.]. 

Page 178-179. Research problems 6.4 and 6.5 have been solved by Dumer [op. 
cit.]. The answer to 6.5 is No. 

Page 267, Fig. 9.1. R. E. Kibler (private communication) has pointed out that 
the asterisks may be removed from the entries [127, 29, 43], [255, 45, 87] and [255,
37, 91], since there are minimum weight codewords which are low-degree 
multiples of the generator polynomial. 

Page 280, Research problem 9.4. T. Helleseth (IEEE Trans. Info. Theory, Vol. 
IT-25 (1979) 361-362) has shown that no other binary primitive BCH codes are
quasi-perfect. 

Page 299, Research problem 10.1. R. E. Kibler (private communication) has 
found a large number of such codes. 

Page 323. The proof of Theorem 9 is valid only for q = 2^m. For odd q the code
need not be cyclic. See G. Falkner, W. Heise, B. Kowol and E. Zehender, On the 
existence of cyclic optimal codes, Atti Sem. Mat. Fis. Università di Modena, to 
appear. 

Page 394, line 9 from the bottom. As pointed out by Massey in [918, p. 100], the
number of majority gates required in an L-step decoder need never exceed the 
dimension of the code. 

Page 479. Research problem 15.2 is also solved in I. I. Dumer, Some new 
uniformly packed codes, in: Proceedings MFTI, Radiotechnology and
Electronics Series (MFTI, Moscow, 1976) pp. 72-78.

Page 546. The answer to Research problem 17.6 is No, and in fact R. E. Kibler,
Some new constant weight codes, IEEE Trans. Info. Theory, Vol. IT-26 (May
1980) 364-365, shows that 27 ≤ A(24, 10, 8) ≤ 68.

Appendix A, Figures 1 and 3. For later versions of the tables of A(n, d) and 
A(n, d, w) see R. L. Graham and N. J. A. Sloane, Lower bounds for constant 
weight codes, IEEE Trans. Info. Theory, Vol. IT-26 (1980) 37-43; M. R. Best, 
Binary codes with a minimum distance of four, loc. cit., Vol. IT-26 (1980), 738-743;
and other papers in this journal. 

On page 682, line 6, the value of X corresponding to F = 30 should be changed
from .039 to .093. 


Linear codes 


§1. Linear codes 


Codes were invented to correct errors on noisy communication channels.
Suppose there is a telegraph wire from Boston to New York down which 0's
and 1's can be sent. Usually when a 0 is sent it is received as a 0, but
occasionally a 0 will be received as a 1, or a 1 as a 0. Let's say that on the
average 1 out of every 100 symbols will be in error. I.e. for each symbol there
is a probability p = 1/100 that the channel will make a mistake. This is called a
binary symmetric channel (Fig. 1.1).
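
The behaviour of such a channel can be sketched in a few lines of Python. This is only an illustration: the function name `binary_symmetric_channel` and the use of a seeded `random.Random` generator are assumptions made for the sketch, not part of the text.

```python
import random


def binary_symmetric_channel(bits, p, rng=None):
    """Return a copy of `bits` in which each symbol is flipped with probability p.

    `rng` is an optional random.Random instance (hypothetical helper argument),
    so that a run can be reproduced by fixing the seed.
    """
    rng = rng or random.Random()
    # rng.random() lies in [0, 1), so p = 0 never flips and p = 1 always flips.
    return [(b ^ 1) if rng.random() < p else b for b in bits]


sent = [0, 1, 1, 0, 1, 0, 0, 1]
received = binary_symmetric_channel(sent, p=0.01, rng=random.Random(0))
number_of_errors = sum(s != r for s, r in zip(sent, received))
```

With p = 1/100 most short blocks arrive unchanged, which is why the interesting question is what happens over very long messages.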

There are a lot of important messages to be sent down this wire, and they 
must be sent as quickly and reliably as possible. The messages are already 
written as a string of 0's and 1's - perhaps they are being produced by a
computer. 

We are going to encode these messages to give them some protection
against errors on the channel. A block of k message symbols $u = u_1 u_2 \cdots u_k$
($u_i = 0$ or 1) will be encoded into a codeword $x = x_1 x_2 \cdots x_n$ ($x_i = 0$ or 1) where
$n \ge k$ (Fig. 1.2); these codewords form a code.

[Figure: each symbol sent (0 or 1) is received unchanged with probability 1 - p.]

Fig. 1.1. The binary symmetric channel, with error probability p. In general $0 \le p < 1/2$.

[Figure labels: $u_1 u_2 \cdots$, NOISE.]

Fig. 1.2

The method of encoding we are about to describe produces what is called a
linear code. The first part of the codeword consists of the message itself:

\[ x_1 = u_1, \quad x_2 = u_2, \quad \ldots, \quad x_k = u_k, \]

followed by n - k check symbols

\[ x_{k+1}, \ldots, x_n. \]


The check symbols are chosen so that the codewords satisfy

\[ H \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = Hx^T = 0, \tag{1} \]

where the (n - k) × n matrix H is the parity check matrix of the code, given
by

\[ H = [A \mid I_{n-k}], \tag{2} \]

A is some fixed (n - k) × k matrix of 0's and 1's, and

\[ I_{n-k} = \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{pmatrix} \]

is the (n - k) × (n - k) unit matrix. The arithmetic in Equation (1) is to be
performed modulo 2, i.e. 0 + 1 = 1, 1 + 1 = 0, -1 = +1. We shall refer to this as
binary arithmetic.


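Example. Code # 1. The parity check matrix

\[ H = \left[ \begin{array}{ccc|ccc} 0&1&1&1&0&0 \\ 1&0&1&0&1&0 \\ 1&1&0&0&0&1 \end{array} \right] \tag{3} \]

defines a code with k = 3 and n = 6.
The message $u_1 u_2 u_3$ is encoded into the codeword $x = x_1 x_2 x_3 x_4 x_5 x_6$, which
begins with the message itself:

\[ x_1 = u_1, \quad x_2 = u_2, \quad x_3 = u_3, \]

followed by three check symbols $x_4 x_5 x_6$ chosen so that $Hx^T = 0$, i.e. so that

\[ \begin{aligned} x_2 + x_3 + x_4 &= 0, \\ x_1 + x_3 + x_5 &= 0, \\ x_1 + x_2 + x_6 &= 0. \end{aligned} \tag{4} \]

If the message is u = 011, then $x_1 = 0$, $x_2 = 1$, $x_3 = 1$, and the check symbols
are

\[ x_4 = -1 - 1 = 1 + 1 = 2 = 0, \quad x_5 = -1 = 1, \quad x_6 = -1 = 1, \]

so the codeword is x = 011011.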

The Equations (4) are called the parity check equations, or simply parity 
checks, of the code. 

The first parity check equation says that the 2nd, 3rd and 4th symbols of
every codeword must add to 0 modulo 2; i.e. their sum must have even parity
(hence the name!).

Since each of the 3 message symbols $u_1 u_2 u_3$ is 0 or 1, there are altogether
$2^3 = 8$ codewords in this code. They are:


000000 011011 110110
001110 100011 111000.
010101 101101


In the general code there are $2^k$ codewords.
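
As a check on this list, the codewords can be enumerated directly from the defining condition that every parity check sums to 0 modulo 2. The sketch below assumes that the parity check matrix of code # 1 has rows 011100, 101010 and 110001 (which is consistent with the eight codewords listed); the helper names `is_codeword` and `codewords` are illustrative only.

```python
from itertools import product

# Assumed parity check matrix of code # 1, in the form [A | I_3].
H = [
    [0, 1, 1, 1, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 1],
]


def is_codeword(H, x):
    """True if every row of H has an even overlap with x, i.e. H x^T = 0 mod 2."""
    return all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H)


def codewords(H):
    """List all binary vectors of length n that satisfy the parity checks."""
    n = len(H[0])
    return ["".join(map(str, x)) for x in product((0, 1), repeat=n) if is_codeword(H, x)]


code_1 = codewords(H)
```

Searching all 2^n vectors is only sensible for very small n; it is used here because it follows the definition literally.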

As we shall see, code # 1 is capable of correcting a single channel error (in 
any one of the six symbols), and using this code reduces the average 
probability of error per symbol from p = .01 to .00072 (see Problem 24). This 
is achieved at the cost of sending 6 symbols only 3 of which are message 
symbols. 

We take (1) as our general definition: 


Definition. Let H be any binary matrix. The linear code with parity check
matrix H consists of all vectors x such that

\[ Hx^T = 0 \]

(where this equation is to be interpreted modulo 2).


It is convenient, but not essential, if H has the form shown in (2) and (3), in 
which case the first k symbols in each codeword are message or information 
symbols, and the last n — k are check symbols. 




Linear codes are the most important for practical applications and are the 
simplest to understand. Nonlinear codes will be introduced in Ch. 2. 


Example. Code # 2, a repetition code. A code with k = 1, n = 5, and parity
check matrix

\[ H = \begin{bmatrix} 1 & 1 & & & \\ 1 & & 1 & & \\ 1 & & & 1 & \\ 1 & & & & 1 \end{bmatrix} \quad \text{(blanks denote zeros).} \]

Each codeword contains just one message symbol u. The parity check
equations are

\[ x_1 + x_2 = 0, \quad x_1 + x_3 = 0, \quad x_1 + x_4 = 0, \quad x_1 + x_5 = 0, \]

i.e. $x_1 = x_2 = x_3 = x_4 = x_5 = u$. So there are only two codewords, 00000 and
11111. The message symbol is simply repeated 5 times: this is called a
repetition code.


Example. Code # 3, an even weight code. A code with k = 3, n = 4 and parity
check matrix H = (1111). Each codeword contains 3 message symbols $x_1 x_2 x_3$
and one check symbol $x_4 = x_1 + x_2 + x_3$. The $2^3 = 8$ codewords are 0000, 0011,
0101, 1001, 0110, 1010, 1100, 1111, i.e. all vectors with an even number of 1's.


Problems. (1) Code #4 has parity check matrix

\[ H = \begin{bmatrix} 1&0&1&0 \\ 1&1&0&1 \end{bmatrix}. \]

List all the codewords. Repeat for

\[ H = \begin{bmatrix} 0&1&1&1 \\ 1&1&0&1 \end{bmatrix}. \]

How are these codes related?
(2) Code #5 has parity check matrix

\[ H = \begin{bmatrix} 0&1&1&1&1&0&0 \\ 1&0&1&1&0&1&0 \\ 1&1&0&1&0&0&1 \end{bmatrix}. \]

List all the codewords.

(3) If p > 1/2 in Fig. 1.1, show that interchanging the names of the received
symbols changes this to a binary symmetric channel with p < 1/2. If p = 1/2 show
that no communication is possible.

§2. Properties of a linear code


(i) The definition again: $x = x_1 \cdots x_n$ is a codeword if and only if

\[ Hx^T = 0. \tag{1} \]

(ii) Usually the parity check matrix H is an (n - k) × n matrix of the form

\[ H = [A \mid I_{n-k}], \tag{2} \]

and as we have seen there are $2^k$ codewords satisfying (1). (This is still true
even if H doesn't have this form, provided H has n columns and n - k
linearly independent rows.) When H has the form (2), the codewords look like
this:

\[ x = \underbrace{x_1 \cdots x_k}_{\text{message symbols}} \; \underbrace{x_{k+1} \cdots x_n}_{\text{check symbols}}. \]

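(iii) The generator matrix. If the message is $u = u_1 \cdots u_k$, what is the
corresponding codeword $x = x_1 \cdots x_n$? First $x_1 = u_1, \ldots, x_k = u_k$, or

\[ \begin{pmatrix} x_1 \\ \vdots \\ x_k \end{pmatrix} = I_k \begin{pmatrix} u_1 \\ \vdots \\ u_k \end{pmatrix}, \quad I_k = \text{unit matrix}. \tag{5} \]

Then from (1) and (2),

\[ [A \mid I_{n-k}] \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = 0, \]

so that

\[ \begin{pmatrix} x_{k+1} \\ \vdots \\ x_n \end{pmatrix} = -A \begin{pmatrix} x_1 \\ \vdots \\ x_k \end{pmatrix} = -A \begin{pmatrix} u_1 \\ \vdots \\ u_k \end{pmatrix} \quad \text{from (5)}. \tag{6} \]

In the binary case -A = A, but later we shall treat cases where -A ≠ A.
Putting (5) on top of (6):

\[ \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{bmatrix} I_k \\ -A \end{bmatrix} \begin{pmatrix} u_1 \\ \vdots \\ u_k \end{pmatrix}, \]

and transposing, we get

\[ x = uG \tag{7} \]

where

\[ G = [I_k \mid -A^T]. \tag{8} \]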

If H is in standard form G is easily obtained from H (see (2)). G is called a
generator matrix of the code, for (7) just says that the codewords are all
possible linear combinations of the rows of G. (We could have used this as
the definition of the code.) (1) and (7) together imply that G and H are related
by

\[ GH^T = 0 \quad \text{or} \quad HG^T = 0. \tag{9} \]


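Example. Code # 1 (cont.). A generator matrix is

\[ G = [I_3 \mid -A^T] = \left[ \begin{array}{ccc|ccc} 1&0&0&0&1&1 \\ 0&1&0&1&0&1 \\ 0&0&1&1&1&0 \end{array} \right] \begin{array}{l} \text{row 1} \\ \text{row 2} \\ \text{row 3.} \end{array} \]

The 8 codewords are (from (7))

\[ u_1 \cdot \text{row 1} + u_2 \cdot \text{row 2} + u_3 \cdot \text{row 3} \quad (u_1, u_2, u_3 = 0 \text{ or } 1). \]

We see once again that the codeword corresponding to the message u = 011 is

x = uG
  = row 2 + row 3
  = 010101 + 001110
  = 011011,

the addition being done mod 2 as usual.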
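
The passage from a standard-form parity check matrix to a generator matrix, the encoding rule x = uG, and the relation between G and H can all be checked mechanically. The following is a minimal sketch for the binary case (where -A = A), assuming the parity check matrix of code # 1 has rows 011100, 101010, 110001; the function names are hypothetical.

```python
def generator_from_parity_check(H, k):
    """Return G = [I_k | A^T] for a binary H = [A | I_(n-k)] in standard form."""
    A = [row[:k] for row in H]                      # the (n-k) x k block A
    G = []
    for i in range(k):
        identity_part = [1 if j == i else 0 for j in range(k)]
        check_part = [A_row[i] for A_row in A]      # i-th column of A = i-th row of A^T
        G.append(identity_part + check_part)
    return G


def encode(u, G):
    """Compute the codeword x = uG with arithmetic modulo 2."""
    return [sum(ui * gi for ui, gi in zip(u, column)) % 2 for column in zip(*G)]


def times_transpose_mod2(G, H):
    """Compute the matrix product G H^T modulo 2."""
    return [[sum(g * h for g, h in zip(g_row, h_row)) % 2 for h_row in H] for g_row in G]


H = [
    [0, 1, 1, 1, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 1],
]
G = generator_from_parity_check(H, k=3)
codeword = encode([0, 1, 1], G)          # message u = 011
product_GHt = times_transpose_mod2(G, H)
```

Here `codeword` is the vector 011011 and `product_GHt` is the 3 × 3 zero matrix, in agreement with the example.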


(iv) The parameters of a linear code. The codeword $x = x_1 \cdots x_n$ is said to
have length n. This is measuring the length as a tailor would, not a
mathematician. n is also called the block length of the code. If H has n - k
linearly independent rows, there are $2^k$ codewords. k is called the dimension
of the code. We call the code an [n, k] code.

This code uses n symbols to send k message symbols, so it is said to have
rate or efficiency R = k/n.

(v) Other generator and parity check matrices. A code can have several
different generator matrices. E.g.

\[ \begin{bmatrix} 1&1&1&0 \\ 0&1&0&1 \end{bmatrix}, \quad \begin{bmatrix} 1&0&1&1 \\ 0&1&0&1 \end{bmatrix} \]

are both generator matrices for code # 4. In fact any maximal set of linearly
independent codewords taken from a given code can be used as the rows of a
generator matrix for that code.

A parity check on a code $\mathcal{C}$ is any row vector h such that $hx^T = 0$ for all
codewords $x \in \mathcal{C}$. Then similarly any maximal set of linearly independent
parity checks can be used as the rows of a parity check matrix H for $\mathcal{C}$. E.g.

\[ \begin{bmatrix} 1&0&1&0 \\ 1&1&0&1 \end{bmatrix}, \quad \begin{bmatrix} 0&1&1&1 \\ 1&1&0&1 \end{bmatrix} \]

are both parity check matrices for code # 4.

(vi) Codes over other fields. Instead of only using 0's and 1's, we could have
allowed the symbols to be from any finite field (see Chapters 3 and 4). For
example a ternary code has symbols 0, 1 and 2, and all calculations of parity
checks etc. are done modulo 3 (1 + 2 = 0, -1 = 2, 2 + 2 = 1, 2 · 2 = 1, etc.). If
the symbols are from a finite field with q elements, the code is still defined by
(1) and (2), or equivalently by (7) and (8), and an [n, k] code contains $q^k$
codewords.


Example. Code # 6. A [4, 2] ternary code with parity check matrix

\[ H = \begin{bmatrix} 1&1&1&0 \\ 1&2&0&1 \end{bmatrix} = [A \mid I_2]. \]

The generator matrix is (from (8))

\[ G = [I_2 \mid -A^T] = \begin{bmatrix} 1&0&2&2 \\ 0&1&2&1 \end{bmatrix} \begin{array}{l} \text{row 1} \\ \text{row 2.} \end{array} \]

There are 9 codewords $u_1 \cdot$ row 1 + $u_2 \cdot$ row 2 ($u_1, u_2$ = 0, 1 or 2), as follows:

message  codeword    message  codeword    message  codeword
   u        x           u        x           u        x

  00      0000         10      1022         20      2011
  01      0121         11      1110         21      2102
  02      0212         12      1201         22      2220

This code has rate R = k/n = 1/2.
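
The table of nine codewords can be regenerated by forming all combinations of the two generator rows modulo 3. The sketch below assumes the generator rows 1022 and 0121 and the parity check rows 1110 and 1201, which agree with the codewords in the table; the names `encode` and `table` are illustrative.

```python
from itertools import product

q = 3                                    # ternary arithmetic: everything is reduced mod 3
G = [[1, 0, 2, 2], [0, 1, 2, 1]]         # assumed generator matrix [I_2 | -A^T]
H = [[1, 1, 1, 0], [1, 2, 0, 1]]         # assumed parity check matrix [A | I_2]


def encode(u, G, q):
    """Compute x = uG over the integers modulo q."""
    return tuple(sum(ui * gi for ui, gi in zip(u, column)) % q for column in zip(*G))


# Map every message (u1, u2) to its codeword.
table = {u: encode(u, G, q) for u in product(range(q), repeat=len(G))}

# Every codeword should satisfy H x^T = 0 modulo 3.
all_checks_pass = all(
    sum(h * xi for h, xi in zip(row, x)) % q == 0 for x in table.values() for row in H
)
```

For instance the message 12 is sent to the codeword 1201, matching the table.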


(vii) Linearity. If x and y are codewords of a given code, so is x + y,
because $H(x + y)^T = Hx^T + Hy^T = 0$. If c is any element of the field, then cx is
also a codeword, because $H(cx)^T = cHx^T = 0$. E.g. in a ternary code if x is a
codeword so is 2x = -x. That is why these are called linear codes. Such a
code is also an additive group, and a vector space over the field.


Problems. (4) Code # 2 (cont.). Give parity check and generator matrices for 
the general [n, 1] repetition code. 

(5) Code #3 (cont.). Give parity check and generator matrices for the 
general [n, n — 1] even weight code. 

(6) If the code $\mathcal{C}$ has an invertible generator matrix, what is $\mathcal{C}$?


§3. At the receiving end


(We now return to binary codes.) Suppose the message $u = u_1 \cdots u_k$ is
encoded into the codeword $x = x_1 \cdots x_n$, which is then sent through the
channel. Because of channel noise, the received vector $y = y_1 \cdots y_n$ may be
different from x. Let's define the error vector

\[ e = y - x = e_1 \cdots e_n. \tag{10} \]

Then $e_i = 0$ with probability 1 - p (and the i-th symbol is correct), and $e_i = 1$
with probability p (and the i-th symbol is wrong). In the example of §1 p was
equal to 1/100, but in general p can be anywhere in the range $0 \le p < 1/2$. So we
describe the action of the channel by saying it distorts the codeword x by
adding the error vector e to it.

The decoder (Fig. 1.3) must decide from y which message u or (usually 
simpler) which codeword x was transmitted. Of course it's enough if the 
decoder finds e, for then x = y—e. Now the decoder can never be certain 
what e was. His strategy therefore will be to choose the most likely error 
vector e, given that y was received. Provided the codewords are all equally 
likely, this strategy is optimum in the sense that it minimizes the probability 
of the decoder making a mistake, and is called maximum likelihood decoding. 

To describe how the decoder does this, we need two important definitions. 


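[Block diagram: MESSAGE SOURCE, ENCODER, CHANNEL, DECODER. The message
$u = u_1 \cdots u_k$ is encoded into the codeword $x = x_1 \cdots x_n$; the channel adds the
error vector $e = e_1 \cdots e_n$; the received vector is y = x + e; the decoder
outputs an estimate of the message.]

Fig. 1.3. The overall communication system.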


Definition. The (Hamming) distance between two vectors $x = x_1 \cdots x_n$ and
$y = y_1 \cdots y_n$ is the number of places where they differ, and is denoted by
dist (x, y). E.g.


dist (10111, 00101) = 2, dist (0122, 1220) = 3


(the same definition holds for nonbinary vectors).


Definition. The (Hamming) weight of a vector $x = x_1 \cdots x_n$ is the number of
nonzero $x_i$, and is denoted by wt (x). E.g.


wt (101110) = 4, wt (01212110) = 6.

Obviously

dist (x, y) = wt (x - y), (11)


for both sides express the number of places where x and y differ.
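
Both definitions translate directly into code. In this sketch vectors are represented as strings of digits, an assumption made for readability; `dist`, `wt` and `subtract` are illustrative names, and `subtract` works coordinate by coordinate modulo q so that relation (11) can be checked.

```python
def wt(x):
    """Hamming weight: the number of nonzero symbols in x."""
    return sum(1 for symbol in x if symbol != "0")


def dist(x, y):
    """Hamming distance: the number of places where x and y differ."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(1 for a, b in zip(x, y) if a != b)


def subtract(x, y, q=2):
    """Coordinate-wise difference x - y modulo q, returned as a digit string."""
    return "".join(str((int(a) - int(b)) % q) for a, b in zip(x, y))


binary_example = dist("10111", "00101")
ternary_example = dist("0122", "1220")
```

The two example values reproduce the distances 2 and 3 quoted in the definition.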

Problem. (7) Define the intersection of binary vectors x and y to be the vector

\[ x * y = (x_1 y_1, \ldots, x_n y_n), \]

which has 1's only where both x and y do. E.g. 11001 * 10111 = 10001. Show
that

\[ \operatorname{wt}(x + y) = \operatorname{wt}(x) + \operatorname{wt}(y) - 2 \operatorname{wt}(x * y). \tag{12} \]


Now back to decoding. Errors occur with probability p. For instance

\[ \begin{aligned} \operatorname{Prob}\{e = 00000\} &= (1 - p)^5, \\ \operatorname{Prob}\{e = 01000\} &= p(1 - p)^4, \\ \operatorname{Prob}\{e = 10010\} &= p^2(1 - p)^3. \end{aligned} \]

In general if v is some fixed vector of weight a,

\[ \operatorname{Prob}\{e = v\} = p^a (1 - p)^{n-a}. \tag{13} \]

Since p < 1/2, we have 1 - p > p, and

\[ (1 - p)^5 > p(1 - p)^4 > p^2(1 - p)^3 > \cdots. \]

Therefore a particular error vector of weight 1 is more likely than a particular
error vector of weight 2, and so on. So the decoder's strategy is:

Decode y as the nearest codeword x (nearest in Hamming distance), i.e.:
Pick that error vector e which has least weight.

This is called nearest neighbor decoding.

A brute force decoding scheme then is simply to compare y with all $2^k$
codewords and pick the closest. This is fine for small codes. But if k is large
this is impossible! One of the aims of coding theory is to find codes which can
be decoded by a faster method than this.
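
The brute force scheme is easy to write down for a small code. The sketch below uses the eight codewords of code # 1 as strings; `nearest_neighbor_decode` and `error_vector_probability` are hypothetical names, and when two codewords are equally close the sketch simply returns the first one in the list, which is one arbitrary way of breaking ties.

```python
CODE_1 = ["000000", "001110", "010101", "011011",
          "100011", "101101", "110110", "111000"]


def dist(x, y):
    """Hamming distance between two equal-length strings."""
    return sum(1 for a, b in zip(x, y) if a != b)


def nearest_neighbor_decode(y, code):
    """Compare y with every codeword and return the closest one."""
    return min(code, key=lambda c: dist(c, y))


def error_vector_probability(weight, n, p):
    """Probability p^a (1 - p)^(n - a) of one particular error vector of weight a."""
    return p ** weight * (1 - p) ** (n - weight)


# The codeword 011011 with its fourth symbol received in error.
decoded = nearest_neighbor_decode("011111", CODE_1)

# For p < 1/2 a particular error vector becomes less likely as its weight grows.
probabilities = [error_vector_probability(a, 6, 0.01) for a in range(7)]
```

The cost is one distance computation per codeword, so the running time grows like 2^k, which is the difficulty the text points out.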


The minimum distance of a code. The third important parameter of a code $\mathcal{C}$,
besides the length and dimension, is the minimum Hamming distance between
its codewords:

\[ d = \min \operatorname{dist}(u, v) = \min \operatorname{wt}(u - v), \quad u \in \mathcal{C}, \; v \in \mathcal{C}, \; u \ne v. \tag{14} \]

d is called the minimum distance or simply the distance of the code. Any two
codewords differ in at least d places.

A linear code of length n, dimension k, and minimum distance d will be
called an [n, k, d] code.

To find the minimum distance of a linear code it is not necessary to
compare every pair of codewords. For if u and v belong to a linear code $\mathcal{C}$,
u - v = w is also a codeword, and (from (14))

\[ d = \min_{w \in \mathcal{C}, \, w \ne 0} \operatorname{wt}(w). \]

In other words:


Theorem 1. The minimum distance of a linear code is the minimum weight of 
any nonzero codeword. 


Example. The minimum distances of codes #1, #2, #3, #6 are 3, 5, 2, 3
respectively.
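
These four values can be confirmed both ways, by comparing all pairs of codewords and by taking the minimum nonzero weight as Theorem 1 allows. The sketch assumes the codeword lists given in the examples for codes #1, #2, #3 and #6; the function names are illustrative.

```python
from itertools import combinations


def wt(x):
    """Number of nonzero symbols in the digit string x."""
    return sum(1 for symbol in x if symbol != "0")


def dist(x, y):
    """Number of places where the digit strings x and y differ."""
    return sum(1 for a, b in zip(x, y) if a != b)


def minimum_distance(code):
    """Minimum distance found by comparing every pair of codewords."""
    return min(dist(u, v) for u, v in combinations(code, 2))


def minimum_weight(code):
    """Minimum weight of a nonzero codeword (equals d for a linear code)."""
    return min(wt(x) for x in code if wt(x) > 0)


CODES = {
    "#1": ["000000", "001110", "010101", "011011", "100011", "101101", "110110", "111000"],
    "#2": ["00000", "11111"],
    "#3": ["0000", "0011", "0101", "1001", "0110", "1010", "1100", "1111"],
    "#6": ["0000", "0121", "0212", "1022", "1110", "1201", "2011", "2102", "2220"],
}

results = {name: (minimum_distance(code), minimum_weight(code)) for name, code in CODES.items()}
```

For a code with M codewords the pairwise search needs about M²/2 comparisons while the weight search needs only M, which is the saving that linearity buys.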


How many errors can a code correct? 


Theorem 2. A code with minimum distance d can correct $[\tfrac{1}{2}(d - 1)]$ errors.* If d
is even, the code can simultaneously correct $\tfrac{1}{2}(d - 2)$ errors and detect d/2
errors.


Proof. Suppose d = 3 (Fig. 1.4). The sphere of radius r and center u consists 
of all vectors v such that dist (u, v) ≤ r. If a sphere of radius 1 is drawn
around each codeword, these spheres do not overlap. Then if codeword u is 
transmitted and one error occurs, so that the vector a is received, then a is 
inside the sphere around u, and is still closer to u than to any other codeword 
v. Thus nearest neighbor decoding will correct this error. 

Similarly if d =2t+ 1, spheres of radius t around each codeword do not 
overlap, and the code can correct t errors. 

Now suppose d is even (Fig. 1.5, where d = 4). Spheres of radius $\tfrac{1}{2}(d - 2)$
around the codewords are disjoint and so the code can correct $\tfrac{1}{2}(d - 2)$ errors.
But if d/2 errors occur the received vector a may be midway between 2 
codewords (Fig. 1.5). In this case the decoder can only detect that d/2 (or 
more) errors have occurred. Q.E.D. 


Thus code #1, which has minimum distance 3, is a single-error-correcting 
code. 





Fig. 1.4. A code with minimum distance 3 (● = codeword).


*[x] denotes the greatest integer less than or equal to x. E.g. [3.5] = 3, [-1.5] = -2.


Fig. 1.5. A code with minimum distance 4 (d = 4; ○ = codeword).


On the other hand, if more than d/2 errors occur, the received vector may 
or may not be closer to some other codeword than to the correct one. If it is, 
the decoder will be fooled and will output the wrong codeword. This is called 
a decoding error. Of course with a good code this should rarely happen. 

So far we have assumed that the decoder will always try to find the nearest 
codeword. This scheme, called complete decoding, is fine for messages which 
can’t be retransmitted, such as a photograph from Mars, or an old magnetic 
tape. In such a case we want to extract as much as possible from the received 
vector. 

But often we want to be more cautious, or cannot afford the most 
expensive decoding method. In such cases we might use an incomplete 
decoding strategy: if it appears that no more than l errors occurred, correct 
them, otherwise reject the message or ask for a retransmission. 

Error detection is an extreme version of this, when the receiver makes no 
attempt to correct errors, but just tests the received vector to see if it is a 
codeword. If it is not, he detects that an error has occurred and asks for a 
retransmission of the message. This scheme has the advantages that the 
algorithm for detecting errors is very simple (see the next section) and the 
probability of an undetected error is very low (§5). The disadvantage is that if
the channel is bad, too much time will be spent retransmitting, which is an 
inefficient use of the channel and produces unpleasant delays. 


Nonbinary codes. Almost everything we have said applies equally well to 
codes over other fields. If the field F has q elements, then the message u, the 
codeword x, the received vector y, and the error vector 


\[ e = y - x = e_1 \cdots e_n \]


all have components from F. 

We assume that $e_i$ is 0 with probability 1 - p > 1/2, and $e_i$ is any of the q - 1
nonzero elements of F with probability p/(q - 1). In other words the channel
is a q-ary symmetric channel, with q inputs, q outputs, a probability 1 - p > 1/q
that no error occurs, and a probability p < (q - 1)/q that an error does occur,
each of the q — 1 possible errors being equally likely. 

[Figure: each of the three symbols sent is received unchanged with probability 1 - p; p = error probability.]

Fig. 1.6. The ternary symmetric channel.


Example. If q = 3, the ternary symmetric channel is shown in Fig. 1.6. 


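Problems. (8) Show that for binary vectors:

(a)
\[ \operatorname{wt}(x + y) \ge \operatorname{wt}(x) - \operatorname{wt}(y), \tag{15} \]
with equality iff $x_i = 1$ whenever $y_i = 1$.

(b)
\[ \operatorname{wt}(x + z) + \operatorname{wt}(y + z) + \operatorname{wt}(x + y + z) \ge 2 \operatorname{wt}(x + y + x * y) - \operatorname{wt}(z), \tag{16} \]
with equality iff it never happens that $x_i$ and $y_i$ are 0 and $z_i$ is 1.

(9) Define the product of vectors x and y from any field to be the vector

\[ x * y = (x_1 y_1, \ldots, x_n y_n). \]

For binary vectors this is called the intersection - see Problem 7. Show that
for ternary vectors,

\[ \operatorname{wt}(x + y) = \operatorname{wt}(x) + \operatorname{wt}(y) - f(x * y), \tag{17} \]

where, if $u = u_1 \cdots u_n$ is a ternary vector containing a 0's, b 1's and c 2's,
then f(u) = b + 2c. Therefore

\[ \operatorname{wt}(x) + \operatorname{wt}(y) - 2 \operatorname{wt}(x * y) \le \operatorname{wt}(x + y) \le \operatorname{wt}(x) + \operatorname{wt}(y) - \operatorname{wt}(x * y). \tag{18} \]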


(10) Show that in a linear binary code, either all the codewords have even 
weight, or exactly half have even weight and half have odd weight. 
(11) Show that in a linear binary code, either all the codewords begin with 
0, or exactly half begin with 0 and half with 1. Generalize! 

Problems 10 and 11 both follow from: 

(12) Suppose G is an abelian group which contains a subset A with three 
properties: (i) if a_1, a_2 ∈ A, then a_1 - a_2 ∈ A, (ii) if b_1, b_2 ∉ A, then b_1 - b_2 ∈ A, 
(iii) if a ∈ A, b ∉ A, then a + b ∉ A. Show that either A = G, or A is a 
subgroup of G and there is an element b ∉ A such that 


G = A ∪ (b + A), 


where b + A is the set of all elements b + a, a ∈ A. 
(13) Show that Hamming distance obeys the triangle inequality: for any 
vectors x = x_1 ... x_n, y = y_1 ... y_n, z = z_1 ... z_n, 


dist (x, y) + dist (y, z) ≥ dist (x, z). (19) 


Show that equality holds iff, for all i, either x_i = y_i or y_i = z_i (or both). 

(14) Hence show that if a code has minimum distance d, the codeword x is 
transmitted, not more than ½(d - 1) errors occur, and y is received, then 
dist (x, y) < dist (y, z) for all codewords z ≠ x. (A more formal proof of 
Theorem 2.) 

(15) Show that a code can simultaneously correct ≤ a errors and detect 
a + 1, ..., b errors iff it has minimum distance at least a + b + 1. 

(16) Show that if x and y are binary vectors, the Euclidean distance 
between the points x and y is 


(Σ_{i=1}^{n} (x_i - y_i)^2)^{1/2} = √(dist (x, y)). 


(17) Combining two codes (I). Let G_1, G_2 be generator matrices for 
[n_1, k, d_1] and [n_2, k, d_2] codes respectively. Show that the codes with 
generator matrices 

[ G_1   0  ] 
[  0   G_2 ] 

and (G_1 | G_2) are [n_1 + n_2, 2k, min {d_1, d_2}] and [n_1 + n_2, k, d] codes, respectively, 
where d ≥ d_1 + d_2. 

(18) Binomial coefficients. The binomial coefficient \binom{x}{m}, pronounced "x 
choose m," is defined by 


\binom{x}{m} = x(x - 1) ... (x - m + 1) / m!,   if m is a positive integer, 
             = 1,                                if m = 0, 
             = 0,                                otherwise, 


where x is any real number, and m! = 1 · 2 · 3 · ... · (m - 1) m, 0! = 1. The 
reader should know the following properties: 
(a) If x = n is a nonnegative integer and n ≥ m ≥ 0, 


\binom{n}{m} = n! / (m! (n - m)!), 


which is the number of unordered selections of m objects from a set of n 
objects. 

(b) There are \binom{n}{m} binary vectors of length n and weight m. There are 
(q - 1)^m \binom{n}{m} vectors from a field of q elements which have length n and weight 
m. [For each of the m nonzero coordinates can be filled with any field element 
except zero.] 

(c) The binomial series: 


(a + b)^n = Σ_{m=0}^{n} \binom{n}{m} a^{n-m} b^m,   if n is a nonnegative integer; 


(1 + b)^x = Σ_{m=0}^{∞} \binom{x}{m} b^m,   for |b| < 1 and any real x; 


4-5 (by > e t for |b] <1 and any real x. 


[Remember the student who, when asked to expand (a + b)^n on an exam, 
replied: (a+b)^n, (a + b)^n, (a  +  b)^n, (a   +   b)^n, ...] 
(d) Easy identities. (Here n and m are nonnegative integers, x is any real 


Gea) 

)=0 for m>n>0, 
eed 

Qu c cR 

| 

| 

| 


Bos (p) =z if n=1, 
m odd 


=0 if n zl, 
= 2" +0 +i)" +0-d)")i=V—-1. (20) 


(19) Suppose u and v are binary vectors with dist (u, v) = d. Show that the 
number of vectors w such that dist (u, w) = r and dist (v, w) = s is \binom{d}{i} \binom{n-d}{r-i}, 
where i = (d + r - s)/2. If d + r - s is odd this number is 0, while if r + s = d it 
is \binom{d}{r}. 

(20) Let u, v, w and x be four vectors which are pairwise distance d apart; 
d is necessarily even. Show that there is exactly one vector which is at a 
distance of d/2 from each of u, v and w. Show that there is at most one vector 
at distance d/2 from all of u, v, w and x. 


§4. More about decoding a linear code 


Definition of a coset. Let 𝒞 be an [n, k] linear code over a field with q 
elements. For any vector a, the set 


a + 𝒞 = {a + x : x ∈ 𝒞} 


is called a coset (or translate) of 𝒞. Every vector b is in some coset (in b + 𝒞 
for example). a and b are in the same coset iff (a - b) ∈ 𝒞. Each coset 
contains q^k vectors. 


Proposition 3. Two cosets are either disjoint or coincide (partial overlap is 
impossible). 


Proof. If cosets a + 𝒞 and b + 𝒞 overlap, take v ∈ (a + 𝒞) ∩ (b + 𝒞). Then 
v = a + x = b + y, where x and y ∈ 𝒞. Therefore b = a + x - y = a + x' (x' ∈ 𝒞), 
and so b + 𝒞 ⊆ a + 𝒞. Similarly a + 𝒞 ⊆ b + 𝒞, and so a + 𝒞 = b + 𝒞. Q.E.D. 

Therefore the set F^n of all vectors can be partitioned into cosets of 𝒞: 


F^n = 𝒞 ∪ (a_1 + 𝒞) ∪ (a_2 + 𝒞) ∪ ... ∪ (a_t + 𝒞) (21) 
where t = q^{n-k} - 1. 


Suppose the decoder receives the vector y. y must belong to some coset in 
(21), say y = a_i + x (x ∈ 𝒞). What are the possible error vectors e which could 
have occurred? If the codeword x' was transmitted, the error vector is 
e = y - x' = a_i + x - x' = a_i + x'' ∈ a_i + 𝒞. We deduce that: 


the possible error vectors are exactly the vectors in 
the coset containing y. 


So the decoder's strategy is, given y, to choose a minimum weight vector ê 
in the coset containing y, and to decode y as x̂ = y - ê. The minimum weight 
vector in a coset is called the coset leader. (If there is more than one vector 
with the minimum weight, choose one at random and call it the coset leader.) 

We assume that the a_i's in (21) are the coset leaders. 

The standard array. A useful way to describe what the decoder does is by a 
table, called a standard array for the code. The first row consists of the code 
itself, with the zero codeword on the left: 


x^(1) = 0, x^(2), ..., x^(s),   (s = q^k) 


and the other rows are the other cosets a_i + 𝒞, arranged in the same order and 
with the coset leader on the left: 


a_i + x^(1), a_i + x^(2), ..., a_i + x^(s). 


Example. Code #4 (cont.). The [4, 2] code with generator matrix 

G = [ 1 0 1 1 ] 
    [ 0 1 0 1 ] 

has a standard array shown in Fig. 1.7 (ignore the last column for the 
moment). 


message:    00     10     01     11     syndrome S 

code:      0000   1011   0101   1110     (0 0)^T 
coset:     1000   0011   1101   0110     (1 1)^T 
coset:     0100   1111   0001   1010     (0 1)^T 
coset:     0010   1001   0111   1100     (1 0)^T 
(the first column contains the coset leaders) 


Fig. 1.7. A standard array. 


Note that all 16 vectors of length 4 appear, divided into the 4 cosets 
forming the rows, and the coset leaders are on the left. 

Here is how the decoder uses the standard array: When y is received (e.g. 
1111) its position in the array is found. Then the decoder decides that the 
error vector ê is the coset leader found at the extreme left of y (0100), and y is 
decoded as the codeword x̂ = y - ê = 1011 found at the top of the column 
containing y. (The corresponding message is 10.) 

Decoding using a standard array is maximum likelihood decoding. 
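
As an illustration, the standard array of code #4 and the decoding of y = 1111 can be reproduced with the following sketch. The function names are hypothetical, and ties between minimum weight vectors are broken deterministically (leftmost 1 first) rather than at random, a choice made here so that the rows agree with Fig. 1.7.

```python
from itertools import product

def add(u, v):
    """Componentwise sum of two binary vectors."""
    return tuple((a + b) % 2 for a, b in zip(u, v))

def codewords(G):
    """Codewords of the binary code generated by G, messages in the order 00, 10, 01, 11, ..."""
    k, n = len(G), len(G[0])
    words = []
    for rev in product([0, 1], repeat=k):
        u = rev[::-1]                      # let the first message symbol vary fastest
        x = (0,) * n
        for bit, row in zip(u, G):
            if bit:
                x = add(x, tuple(row))
        words.append(x)
    return words

def standard_array(G):
    """Rows are cosets; each row starts with a minimum weight coset leader."""
    code = codewords(G)
    n = len(G[0])
    candidates = sorted(product([0, 1], repeat=n),
                        key=lambda v: (sum(v), [-b for b in v]))
    seen, rows = set(), []
    for a in candidates:                   # lightest unused vector leads the next coset
        if a not in seen:
            row = [add(a, x) for x in code]
            rows.append(row)
            seen.update(row)
    return rows

def decode(y, rows):
    """Return (codeword at the top of y's column, coset leader at the left of y's row)."""
    for row in rows:
        if y in row:
            return rows[0][row.index(y)], row[0]

if __name__ == "__main__":
    rows = standard_array([(1, 0, 1, 1), (0, 1, 0, 1)])
    for row in rows:
        print(" ".join("".join(map(str, v)) for v in row))
    print(decode((1, 1, 1, 1), rows))
```

Storing the whole array needs q^n entries, so this is only practical for very short codes.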


The syndrome. There is an easy way to find which coset y is in: compute the 
vector 
S = Hy^T, 


which is called the syndrome of y. 
Properties of the syndrome. (1) S is a column vector of length n - k. 

(2) The syndrome of y, S = Hy^T, is zero iff y is a codeword (by definition 
of the code). So if no errors occur, the syndrome of y is zero (but not 
conversely). In general, if y = x + e where x ∈ 𝒞, then 
S = Hy^T = Hx^T + He^T = He^T. (22) 
(3) For a binary code, if there are errors at locations a, b, c, ..., so that 
e = 0 ... 0 1 0 ... 0 1 0 ... 0 1 0 ... 0   (with 1's in positions a, b, c, ...), 


then from Equation (22), 
S = Σ_i e_i H_i   (H_i = i-th column of H) 


= H_a + H_b + H_c + ... . 
In words: 


Theorem 4. For a binary code, the syndrome is equal to the sum of the 
columns of H where the errors occurred. [Thus S is called the “syndrome” 
because it gives the symptoms of the errors.] 


(4) Two vectors are in the same coset of 𝒞 iff they have the same 
syndrome. For u and v are in the same coset iff (u - v) ∈ 𝒞 iff H(u - v)^T = 0 
iff Hu^T = Hv^T. Therefore: 


Theorem 5. There is a 1-1 correspondence between syndromes and cosets. 


For example, the cosets in Fig. 1.7 are labeled with their syndromes. 

Thus the syndrome contains all the information that the receiver has about 
the errors. 
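
Theorem 5 means the decoder only needs a table from syndromes to coset leaders, not the whole standard array. The sketch below illustrates this for code #4. The matrix H is an assumption: it is the parity check matrix [A | I] derived here from the generator matrix given above, and the function names are hypothetical.

```python
from itertools import product

# Assumed parity check matrix for code #4, derived from G = [1011; 0101].
H = [[1, 0, 1, 0],
     [1, 1, 0, 1]]

def syndrome(H, y):
    """S = H y^T over GF(2), returned as a tuple."""
    return tuple(sum(h * b for h, b in zip(row, y)) % 2 for row in H)

def leader_table(H):
    """Map each syndrome to a minimum weight vector having that syndrome."""
    n = len(H[0])
    table = {}
    by_weight = sorted(product([0, 1], repeat=n),
                       key=lambda v: (sum(v), [-b for b in v]))
    for e in by_weight:
        table.setdefault(syndrome(H, e), e)   # keep the lightest vector only
    return table

def syndrome_decode(H, y, table):
    """Subtract the coset leader that shares y's syndrome."""
    e_hat = table[syndrome(H, y)]
    return tuple((a - b) % 2 for a, b in zip(y, e_hat))

if __name__ == "__main__":
    table = leader_table(H)
    print(table)
    print(syndrome_decode(H, (1, 1, 1, 1), table))
```

The table has q^(n-k) entries instead of the q^n entries of the full array.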

By Property (2), the pure error detection scheme mentioned in the last 
section just consists of testing if the syndrome is zero. To do this, we 
recompute the parity check symbols using the received information symbols, 
and see if they agree with the received parity check symbols. I.e., we 
re-encode the received information symbols. This only requires a copy of the 
encoding circuit, which is normally a very simple device compared to the 
decoder (see Fig. 7.8). 


Problems. (21) Construct a standard array for code #1. Use it to decode the 
vectors 110100 and 111111. 

(22) Show that if 𝒞 is a binary linear code and a ∉ 𝒞, then 𝒞 ∪ (a + 𝒞) is 
also a linear code. 


§5. Error probability 


When decoding using the standard array, the error vector chosen by the 
decoder is always one of the coset leaders. The decoding is correct if and only 
if the true error vector is indeed a coset leader. If not, the decoder makes a 
decoding error and outputs the wrong codeword. (Some of the information 
symbols may still be correct, even so.) 


Definition. The probability of error, or the word error rate, P_err, for a particular 
decoding scheme is the probability that the decoder output is the wrong 
codeword. 


If there are M codewords x^(1), ..., x^(M), which we assume are used with 
equal probability, then 

P_err = (1/M) Σ_{i=1}^{M} Prob {decoder output ≠ x^(i) | x^(i) was sent}. (23) 

If the decoding is done using a standard array, a decoding error occurs iff e 
is not a coset leader, so 
P_err = Prob {e ≠ coset leader}. 


Suppose there are α_i coset leaders of weight i. Then (using (13)) 
P_err = 1 - Σ_{i=0}^{n} α_i p^i (1 - p)^{n-i}. (24) 


(Since the standard array does maximum likelihood decoding, any other 
decoding scheme will have P_err ≥ (24).) 


Examples. For code #4 (Fig. 1.7), α_0 = 1, α_1 = 3, so 
P_err = 1 - (1 - p)^4 - 3p(1 - p)^3 

= 0.0103 ... if p = 1/100. 


For code #1, α_0 = 1, α_1 = 6, α_2 = 1, so 

P_err = 1 - (1 - p)^6 - 6p(1 - p)^5 - p^2 (1 - p)^4 

= 0.00136 ... if p = 1/100. 
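
The two numerical values can be checked directly from Equation (24). In this sketch the function name word_error_rate is hypothetical, and the coset leader counts are the ones quoted in the examples.

```python
def word_error_rate(alpha, n, p):
    """Equation (24): alpha[i] is the number of coset leaders of weight i."""
    return 1 - sum(a * p**i * (1 - p)**(n - i) for i, a in enumerate(alpha))

if __name__ == "__main__":
    p = 1 / 100
    print(word_error_rate([1, 3], 4, p))      # code #4
    print(word_error_rate([1, 6, 1], 6, p))   # code #1
```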


If the code has minimum distance d = 2t + 1 or 2t + 2, then (by Theorem 2) 
it can correct t errors. So every error vector of weight ≤ t is a coset leader. 
I.e. 

α_i = \binom{n}{i} for 0 ≤ i ≤ t. (25) 


But for i > t the α_i are extremely difficult to calculate and are known for very 
few codes. 

If the channel error probability p is small, 1 - p ≈ 1 and p^i (1 - p)^{n-i} ≫ 
p^{i+1} (1 - p)^{n-i-1}. In this case the terms in (24) with large i are negligible, and 


P_err ≈ 1 - Σ_{i=0}^{t} \binom{n}{i} p^i (1 - p)^{n-i} (26) 
or 
P_err ≈ 1 - Σ_{i=0}^{t} \binom{n}{i} p^i (1 - p)^{n-i} - α_{t+1} p^{t+1} (1 - p)^{n-t-1} (27) 


are useful approximations. In any event the RHS of (26) or (27) is an upper 
bound on P_err. 


Perfect codes. Of course, if α_i = 0 for i > t = [(d - 1)/2], then (26) is exact. 
Such a code is called perfect. 

Thus a perfect t-error-correcting code can correct all errors of weight ≤ t, 
and none of weight greater than t. Equivalently, the spheres of radius t 
around the codewords are disjoint and together contain all vectors of length n. 

We shall see much more about perfect codes in Chs. 6, 20. 


Quasi-perfect codes. On the other hand, if α_i = 0 for i > t + 1, then (27) is 
exact. Such a code is called quasi-perfect. 

Thus a quasi-perfect t-error-correcting code can correct all errors of 
weight ≤ t, some of weight t + 1, and none of weight > t + 1. The spheres of 
radius t + 1 around the codewords may overlap, and together contain all 
vectors of length n. 

We shall meet quasi-perfect codes again in Chapters 6, 9 and 15. 


Sphere-packing or Hamming bound. Suppose 𝒞 is a binary code of length n 
containing M codewords, which can correct t errors. The spheres of radius t 
around the codewords are disjoint. Each of these M spheres contains 
1 + \binom{n}{1} + ... + \binom{n}{t} vectors (see Problem 18(b)). But the total number of vectors in the 
space is 2^n. Therefore we have established: 


Theorem 6. (The sphere-packing or Hamming bound.) 
A t-error-correcting binary code of length n containing M codewords must 
satisfy 

M (1 + \binom{n}{1} + \binom{n}{2} + ... + \binom{n}{t}) ≤ 2^n. (28) 

Similarly, for a code over a field with q elements, 

M (1 + (q - 1) \binom{n}{1} + ... + (q - 1)^t \binom{n}{t}) ≤ q^n. (29) 


For large n see Theorem 32 of Ch. 17. Since the proof of these two bounds 
did not assume linearity, they also hold for nonlinear codes (see Ch. 2). 
By definition, a code is perfect iff equality holds in (28) or (29). 
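
The bounds (28) and (29) are easy to evaluate numerically. The sketch below (function names hypothetical) counts the vectors in a sphere of radius t and tests whether given parameters satisfy the bound, with equality meaning the parameters are those of a perfect code.

```python
from math import comb

def sphere_size(n, t, q=2):
    """Number of vectors within Hamming distance t of a fixed vector of length n."""
    return sum((q - 1)**i * comb(n, i) for i in range(t + 1))

def satisfies_hamming_bound(n, M, t, q=2):
    """Inequality (28), or (29) when q > 2."""
    return M * sphere_size(n, t, q) <= q**n

def meets_bound_with_equality(n, M, t, q=2):
    """True when the spheres of radius t would fill the whole space."""
    return M * sphere_size(n, t, q) == q**n

if __name__ == "__main__":
    # Parameters of a single-error-correcting code of length 7 with 16 codewords.
    print(sphere_size(7, 1), meets_bound_with_equality(7, 16, 1))
```

Satisfying the bound is necessary but not sufficient: the check says nothing about whether a code with those parameters exists.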


Symbol error rate. Since some of the message symbols may be correct even if 
the decoder outputs the wrong codeword, a more useful quantity is the symbol 
error rate, defined as follows. 


Definition. Suppose the code contains M codewords x^(i) = x_1^(i) ... x_n^(i), i = 
1, ..., M, and the first k symbols x_1^(i) ... x_k^(i) in each codeword are information 
symbols. Let x̂ = x̂_1 ... x̂_n be the decoder output. Then the symbol error rate 
P_symb is the average probability that an information symbol is in error after 
decoding: 


P_symb = (1/(kM)) Σ_{j=1}^{k} Σ_{i=1}^{M} Prob {x̂_j ≠ x_j^(i) | x^(i) was sent}. (30) 


Problem. (23) Show that if standard array decoding is used, and the messages 
are equally likely, then the number of information symbols in error after 
decoding does not depend on which codeword was sent. Indeed, if the codeword 
x = x_1 ... x_n is sent and is decoded as x̂ = x̂_1 ... x̂_n, then 

P_symb = (1/k) Σ_{j=1}^{k} Prob {x̂_j ≠ x_j}, (31) 

       = (1/k) Σ_e f(e) Prob {e}, (32) 


where f(e) is the number of incorrect information symbols after decoding, if 
the error vector is e, and so 


P_symb = (1/k) Σ_{i=1}^{2^k} F_i P(c_i), 


where F_i is the weight of the first k places of the codeword at the head of the i-th 
column of the standard array, and P(c_i) is the probability of all binary vectors 
in this column. 


Example. The standard array of Fig. 1.7. Here f(e) = 0 if e is in the first column 
of the standard array (a coset leader), = 1 if e is in columns 2 or 3, and = 2 if e is 
in the last column. From (32): 


P_symb = ½ [1 · (p^4 + 3p^3 q + 3p^2 q^2 + p q^3) + 2 · (p^3 q + 3p^2 q^2)],   q = 1 - p, 
= 0.00530 ... if p = 1/100. 

Using this very simple code has lowered the average probability of error per 
symbol from 0.01 to 0.0053. 
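
The value 0.00530 can be reproduced by summing (32) over all 16 error vectors. This is a sketch under two assumptions: the names are hypothetical, and ties between minimum weight vectors are broken so that the coset leaders are those of Fig. 1.7 (a different choice of leaders can give a different P_symb).

```python
from itertools import product

G = [(1, 0, 1, 1), (0, 1, 0, 1)]   # code #4; the first k = 2 symbols carry the message
k, n = len(G), len(G[0])

def add(u, v):
    return tuple((a + b) % 2 for a, b in zip(u, v))

# All codewords of the code generated by G.
code = []
for u in product([0, 1], repeat=k):
    x = (0,) * n
    for bit, row in zip(u, G):
        if bit:
            x = add(x, row)
    code.append(x)

def symbol_error_rate(p):
    """Equation (32): (1/k) * sum over error vectors e of f(e) * Prob{e}."""
    q = 1 - p
    by_weight = sorted(product([0, 1], repeat=n),
                       key=lambda v: (sum(v), [-b for b in v]))
    seen, total = set(), 0.0
    for leader in by_weight:
        if leader in seen:
            continue
        for head in code:              # head = codeword at the top of e's column
            e = add(leader, head)
            seen.add(e)
            f = sum(head[:k])          # information symbols wrong when e occurs
            total += f * p**sum(e) * q**(n - sum(e))
    return total / k

if __name__ == "__main__":
    print(symbol_error_rate(1 / 100))
```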


Problems. (24) Show that for code #1 with standard array decoding 


P_symb = (22p^2 q^4 + 36p^3 q^3 + 24p^4 q^2 + 12p^5 q + 2p^6)/3 
= 0.00072 ... if p = 1/100. 


As these examples suggest, P_symb is difficult to calculate and is not known for 
most codes. 
(25) Show that for an [n, k] code, 

(1/k) P_err ≤ P_symb ≤ P_err. 


Incomplete decoding. An incomplete decoding scheme which corrects ≤ l 
errors can be described in terms of the standard array as follows. Arrange the 
cosets in order of increasing weight (i.e. decreasing probability) of the coset 
leader (Fig. 1.8). 

[Figure: the standard array divided into two parts. Top part: correct errors (coset leaders of weight ≤ l). Bottom part: detect errors (coset leaders of weight > l).] 

Fig. 1.8. Incomplete decoding using a standard array. 


If the received vector y lies in the top part of the array, as before y is decoded 
as the codeword found at the top of its column. If y lies in the bottom half, the 
decoder just detects that more than l errors have occurred. 


Error detection. When error detection is being used the decoder will make a 
mistake and accept a codeword which is not the one transmitted iff the error 
vector is a nonzero codeword. If the code contains A_i codewords of weight i, 
the error probability is 


P_err = Σ_{i=1}^{n} A_i p^i (1 - p)^{n-i}. (33) 

The probability that an error will be detected and the message retransmitted is 


P_retrans = 1 - (1 - p)^n - P_err. (34) 


Example. For code #1, A_0 = 1, A_3 = 4, A_4 = 3, and 
P_err = 4p^3 (1 - p)^3 + 3p^4 (1 - p)^2 

= 0.00000391 ... if p = 1/100. 

This is very much smaller than the error probability of 0.00136 obtained from 
standard array decoding of this code. The retransmission probability is 
P_retrans = 0.0585 ... . 
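
Equations (33) and (34) can be evaluated from the weight distribution alone. In the sketch below the function names are hypothetical, and the weight distribution of code #1 is the one quoted in the example.

```python
def undetected_error_probability(A, n, p):
    """Equation (33): A maps a weight i to the number of codewords of weight i."""
    return sum(count * p**i * (1 - p)**(n - i) for i, count in A.items() if i > 0)

def retransmission_probability(A, n, p):
    """Equation (34): some error occurred and it was not a nonzero codeword."""
    return 1 - (1 - p)**n - undetected_error_probability(A, n, p)

if __name__ == "__main__":
    A = {0: 1, 3: 4, 4: 3}     # weight distribution of code #1
    print(undetected_error_probability(A, 6, 1 / 100))
    print(retransmission_probability(A, 6, 1 / 100))
```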


Research Problem 1.1. Find the distribution {α_i} of coset leaders for any of the 
common families of codes. This is still unsolved even for first-order 
Reed-Muller codes - see Chapter 14, Berlekamp and Welch [133], Lechner [797-799], 
Sarwate [1144] and Sloane and Dick [1233]. 


§6. Shannon's Theorem on the existence of good codes 


In the last section we saw that the weak code #4 reduced the average 
probability of error per symbol from 0.01 to 0.00530 (at the cost of using 4 
symbols to send 2 message symbols), and that code #1, which also had rate ½, 
further reduced it to 0.00072 (at the cost of using 6 symbols to send 3 message 
symbols). 

In general we would like to know, for a given rate R = k/n, how small we can 
make P_symb with an [n, k] code. The answer is given by a remarkable theorem of 
Shannon, which says that P_err (and hence P_symb, by Problem 25) can be made 
arbitrarily small, provided R is less than the capacity of the channel. 


Definition. The capacity of a binary symmetric channel with error probability p 
(Fig. 1.1) is (see Fig. 1.9) 


C(p) = 1 + p log_2 p + (1 - p) log_2 (1 - p). (35) 


Theorem 7. (Shannon's Theorem; proof omitted.) For any ε > 0, if R < C(p) 
and n is sufficiently large, there is an [n, k] binary code of rate k/n ≥ R with error 
probability P_err < ε. 
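
To get a feeling for Equation (35), the capacity can be evaluated for a few values of p. This is a minimal sketch with a hypothetical function name; the endpoints p = 0 and p = 1 are handled separately using the usual convention 0 log 0 = 0.

```python
from math import log2

def bsc_capacity(p):
    """Capacity (35) of the binary symmetric channel with error probability p."""
    if p in (0.0, 1.0):
        return 1.0             # convention: 0 * log2(0) = 0
    return 1 + p * log2(p) + (1 - p) * log2(1 - p)

if __name__ == "__main__":
    for p in (0.0, 0.01, 0.1, 0.5):
        print(p, bsc_capacity(p))
```

For p = 1/100 the capacity is about 0.919, so the rate ½ of codes #1 and #4 is well below it.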


[Figure: plot of the capacity C(p) against p for 0 ≤ p ≤ 1.] 

Fig. 1.9. Capacity of binary symmetric channel. 

(A similar result holds for nonbinary codes, but with a different definition of 
capacity.) 

Unfortunately this theorem has so far been proved only by probabilistic 
methods and does not tell how to construct these codes. 

In practice (as we have seen) it is difficult to find P_err and P_symb, so the 
minimum distance d is used as a more convenient measure of how good the 
code is. For then the code can correct [½(d - 1)] errors (Theorem 2), and (26) is a 
good approximation to P_err, especially if p is small. 

So one version of the main problem of coding theory is to find codes with 
large R (for efficiency) and large d (to correct many errors). Of course these are 
conflicting goals. Theorem 6 has already given an upper bound on the size of a 
code, and other upper bounds will be given in Ch. 17 (see especially Fig. 17.7). 
On the other hand Theorem 12 at the end of the chapter is a weaker version of 
Theorem 7 and implies that good linear codes exist. However, at the present 
time we do not know how to find such codes. 

Of course for practical purposes one also wants a code which can be easily 
encoded and decoded. 


§7. Hamming codes 


The Hamming single-error-correcting codes are an important family of 
codes which are easy to encode and decode. In this section we discuss only 
binary Hamming codes. 

According to Theorem 4, the syndrome of the received vector is equal to the 
sum of the columns of the parity check matrix H where the errors occurred. 
Therefore to design a single-error-correcting code we should make the columns 
of H nonzero (or else an error in that position would not affect the syndrome 
and would be undetectable) and distinct (for if two columns of H were equal, 
errors in those two positions would be indistinguishable). 

If H is to have r rows (and the code to have r parity checks) there are only 
2^r - 1 columns available, namely the 2^r - 1 nonzero binary vectors of length r. 
E.g. if r = 3, there are 2^3 - 1 = 7 columns available: 


0 0 0 1 1 1 1 
0 1 1 0 0 1 1     (36) 
1 0 1 0 1 0 1 


the binary representations of the numbers 1 to 7. For a Hamming code we use 
them all, and get a code of length n = 2^r - 1. 


Definition. A binary Hamming code ℋ_r of length n = 2^r - 1 (r ≥ 2) has parity 
check matrix H whose columns consist of all nonzero binary vectors of length 
r, each used once. ℋ_r is an [n = 2^r - 1, k = 2^r - 1 - r, d = 3] code. 

Example. Code #5, the [7, 4, 3] Hamming code ℋ_3, with 


    [ 0 0 0 1 1 1 1 ] 
H = [ 0 1 1 0 0 1 1 ]     (37) 
    [ 1 0 1 0 1 0 1 ] 


Here we have taken the columns in the natural order (36) of increasing binary 
numbers. To get H in the standard form of (2), we take the columns in a different 
order: 


     [ 0 1 1 1 1 0 0 ] 
H' = [ 1 0 1 1 0 1 0 ]     (38) 
     [ 1 1 0 1 0 0 1 ] 

In general, 


H' = [A | I] (39) 


where A contains all columns with at least two 1's. 
Obviously changing the order of the columns doesn't affect the number of 
errors a code can correct or its error probability. 


Definition. Two codes are called equivalent if they differ only in the order of 
the symbols. E.g. 


0000 0000 
0011, 0101 
1100 1010 


1111 1111 


are equivalent [4,2,2] codes. So H and H' give equivalent codes. 


Problems. (26) Show that the generator matrices 


| 1 l l 
G= 11 |, G'= 1 |! 
11 11 


generate equivalent codes. 
(27) Show that 


11 111111 
G= 11 ; G'= 1] 1] 
1] 1 l 


generate equivalent codes. 
(28) Show that the Hamming code ℋ_r is unique, in the sense that any linear 
code with parameters [2^r - 1, 2^r - 1 - r, 3] is equivalent to ℋ_r. 

There may be very good engineering or aesthetic reasons for preferring one 
code to another which is equivalent to it. For example a third parity check 
matrix H'' for code #5 is 


100 
010 (40) 
10 


-the same columns, but in yet another order. Now the code is cyclic, i.e. a 
cyclic end-around shift of a codeword is again a codeword. We shall see later 
(Ch. 7) that binary Hamming codes can always be made cyclic. 


Problem. (29) Code #7, the Hamming [15, 11, 3] code ℋ_4. Give the three forms 
of H, corresponding to (37), (38) and (40). Using the (38) form, encode the 
message u = 11111100000, and decode the vector 111000111000111. 


Decoding. Suppose we use H in the form (37), with the i-th column H_i = binary 
representation of i. If there is a single error in the l-th symbol, then from (22) the 
syndrome is S = Hy^T = He^T = H_l = binary representation of l. So decoding is 
easy! (It will never be this easy again.) 
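
The decoding rule just described fits in a few lines. The following sketch (function names hypothetical) builds H in the form (37) for any r, reads the syndrome as a binary number, and flips the symbol at that location. It assumes that at most one error has occurred.

```python
def hamming_H(r):
    """Parity check matrix in the form (37): column i is the binary representation of i."""
    n = 2**r - 1
    return [[(i >> (r - 1 - row)) & 1 for i in range(1, n + 1)] for row in range(r)]

def hamming_decode(y, r):
    """Correct a single error in the received binary vector y of length 2^r - 1."""
    H = hamming_H(r)
    s = [sum(h * b for h, b in zip(row, y)) % 2 for row in H]
    location = int("".join(map(str, s)), 2)   # the syndrome read as a binary number
    x = list(y)
    if location:                              # zero syndrome: y is already a codeword
        x[location - 1] ^= 1
    return x

if __name__ == "__main__":
    # 1110000 is a codeword of weight 3; introduce an error in the 5th symbol.
    print(hamming_decode([1, 1, 1, 0, 1, 0, 0], 3))
```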

Since the code can correct any single error, it has minimum distance d ≥ 3 (by 
Theorem 2). In fact d is equal to 3, for it's easy to find codewords of weight 3 (e.g. 
11100 ... 0 if H is (37)). See also Theorem 10 below. 


Theorem 8. The Hamming codes are perfect single-error-correcting codes. 


Proof. Since the code can correct single errors, the spheres of radius 1 around 
the codewords are disjoint. There are n + 1 = 2^r vectors in each sphere, and 
there are 2^k = 2^{n-r} spheres, giving a total of 2^{n-r} · 2^r = 2^n vectors. So every 
vector of length n is in one of the spheres and the code is perfect. Q.E.D. 


Summary of Properties of Hamming Code ℋ_r. 
length n = 2^r - 1 (for r = 2, 3, ...), 
dimension k = 2^r - 1 - r, 
number of parity checks = r, 
minimum distance d = 3. 


ℋ_r is a perfect single-error-correcting code, and is unique up to 
equivalence. The parity check matrix H is an r × n matrix whose 
columns are all nonzero r-tuples. 





Fig. 1.10. 

Problems. (30) Write down a generator matrix G for ℋ_r and use it to show 
every nonzero codeword has weight ≥ 3. [Hint: if a codeword has weight ≤ 2 
it must be the sum of ≤ 2 rows of G.] 

(31) Show that the distribution of coset leaders for a Hamming code is 
α_0 = 1, α_1 = n. What is the error probability P_err? 


§8. The Dual Code 


If u = u_1 ... u_n, v = v_1 ... v_n are vectors (with components from a field F), 
their scalar product is 
u · v = u_1 v_1 + ... + u_n v_n (41) 


(evaluated in F). For example the binary vectors u = 1101, v = 1111 have 
u · v = 1 + 1 + 0 + 1 = 1. 
If u · v = 0, u and v are called orthogonal. 


Problem. (32) For binary vectors u · v = 0 iff wt (u * v) is even, = 1 iff 
wt (u * v) is odd. Also u · u = 0 iff wt (u) is even. 


As this problem shows, the scalar product in finite fields has rather 
different properties from the scalar product of real vectors used in physics. 


Definition. If 𝒞 is an [n, k] linear code over F, its dual or orthogonal code 𝒞^⊥ 
is the set of vectors which are orthogonal to all codewords of 𝒞: 


𝒞^⊥ = {u | u · v = 0 for all v ∈ 𝒞}. (42) 


Thus from §2, 𝒞^⊥ is exactly the set of all parity checks on 𝒞. If 𝒞 has 
generator matrix G and parity check matrix H, then 𝒞^⊥ has generator 
matrix = H, and parity check matrix = G. Thus 𝒞^⊥ is an [n, n - k] code. 𝒞^⊥ is 
the orthogonal subspace to 𝒞. (We shall discuss the minimum distance of 𝒞^⊥ 
in Ch. 5.) 
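
For short binary codes, Definition (42) can be applied directly by exhaustive search. The sketch below (function name hypothetical) does this for the repetition code of length 4, whose dual should consist of all even weight vectors; the brute-force approach is chosen only for clarity and is impractical for long codes.

```python
from itertools import product

def dual_code(code, n):
    """All binary vectors u of length n with u . v = 0 for every v in code."""
    return [u for u in product([0, 1], repeat=n)
            if all(sum(a * b for a, b in zip(u, v)) % 2 == 0 for v in code)]

if __name__ == "__main__":
    repetition = [(0, 0, 0, 0), (1, 1, 1, 1)]
    dual = dual_code(repetition, 4)
    print(len(dual), dual)
```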


Problems. (33) (a) Show that (𝒞^⊥)^⊥ = 𝒞. (b) Let 𝒞 + 𝒟 = {u + v : u ∈ 𝒞, v ∈ 𝒟}. 
Show that (𝒞 + 𝒟)^⊥ = 𝒞^⊥ ∩ 𝒟^⊥. 

(34) Show the dual of the [n, 1, n] binary repetition code (#2) is the 
[n, n - 1, 2] even weight code (#3). 

If 𝒞 ⊆ 𝒞^⊥, we call 𝒞 weakly self dual (w.s.d.), while if 𝒞 = 𝒞^⊥, 𝒞 is called 
(strictly) self dual. 

Thus 𝒞 is w.s.d. if u · v = 0 for every pair of (not necessarily distinct) 
codewords in 𝒞. 𝒞 is self-dual if it is w.s.d. and has dimension k = ½n (so n 
must be even). 

For example the binary repetition code #2 is w.s.d. iff n is even. When 
n = 2, the repetition code {00, 11} is self-dual. So is the ternary code #6. 


Problems. (35) Construct binary self-dual codes of lengths 4 and 8. 

(36) If n is odd, let 𝒞 be an [n, ½(n - 1)] w.s.d. binary code. Show 
𝒞^⊥ = 𝒞 ∪ {1 + 𝒞}, where 1 is the vector of all 1's. 

(37) Show that the code with parity check matrix H = [A | I] over any field 
is strictly self-dual iff A is a square matrix such that A A^T = -I. 

(38) If 𝒞 is a binary w.s.d. code, show that every codeword has even 
weight. Furthermore, if each row of the generator matrix of 𝒞 has weight 
divisible by 4, then so does every codeword. 

(39) If 𝒞 is a ternary w.s.d. code, show that every codeword has Hamming 
weight divisible by 3. 


§9. Construction of new codes from old (II) 


(I) Adding an overall parity check. Let 𝒞 be an [n, k, d] binary code in which 
some codewords have odd weight. We form a new code 𝒞̂ by adding a 0 at 
the end of every codeword of 𝒞 with even weight, and a 1 at the end of every 
codeword with odd weight. 𝒞̂ has the property that every codeword has even 
weight, i.e. it satisfies the new parity check equation 


x_1 + x_2 + ... + x_{n+1} = 0, 


the "overall" parity check. 

From (12), the distance between every pair of codewords is now even. If 
the minimum distance of 𝒞 was odd, the new minimum distance is d + 1, and 
𝒞̂ is an [n + 1, k, d + 1] code. This technique, of adding more check symbols, 
is generally called extending a code. 
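
As a check on the claim that the minimum distance rises from d to d + 1 when d is odd, the following sketch extends the [7, 4, 3] Hamming code. The function names are hypothetical; the codewords are found by brute force from the parity check matrix (37), and the minimum distance is computed as the minimum nonzero weight, which is valid because the code is linear.

```python
from itertools import product

H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]    # parity check matrix (37) of the [7, 4, 3] Hamming code

def null_space(H):
    """All binary vectors x with H x^T = 0, i.e. the codewords."""
    n = len(H[0])
    return [x for x in product([0, 1], repeat=n)
            if all(sum(h * b for h, b in zip(row, x)) % 2 == 0 for row in H)]

def min_distance(code):
    """Minimum nonzero weight (equal to the minimum distance for a linear code)."""
    return min(sum(x) for x in code if any(x))

def extend(code):
    """Append an overall parity check symbol to every codeword."""
    return [x + (sum(x) % 2,) for x in code]

if __name__ == "__main__":
    code = null_space(H)
    print(len(code), min_distance(code), min_distance(extend(code)))
```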

If 𝒞 has parity check matrix H, 𝒞̂ has parity check matrix 

     [ 1  1 ... 1 | 1 ] 
Ĥ =  [            | 0 ] 
     [     H      | : ] 
     [            | 0 ] 


Example. Code #8. Adding an overall parity check to code #5 gives the 
[8, 4, 4] extended Hamming code with 

     [ 1 1 1 1 1 1 1 1 ] 
H =  [ 0 0 0 1 1 1 1 0 ]     (43) 
     [ 0 1 1 0 0 1 1 0 ] 
     [ 1 0 1 0 1 0 1 0 ] 

Since this has d = 4, according to Theorem 2 it can correct any single error 
and detect any double error. Indeed, recalling that the syndrome S is the sum 
of the columns of H where the errors occurred, we have the following 
decoding scheme. If there are no errors, S = 0. If there is a single error, at 
location i, then 


S = (1, x, y, z)^T, 


where (xyz) is the binary representation of i (see (43)). Finally if there are two 
errors, 


S = (0, *, *, *)^T   (the last three components not all zero), 


and the decoder detects that two (or more) errors have occurred. 


Problems. (40) Show that code #8 is strictly self-dual. 

(41) Show that the extended Hamming code is unique (in the same sense as 
Problem 28). 

For nonbinary codes the same technique may or may not increase the 
minimum distance. 

(42) If one adds an overall parity check (i.e. makes the codewords satisfy 
Σ_i x_i ≡ 0 mod 3) to the ternary codes with generator matrices 


pea - Es 
00111]. ?P* |ooi22]* 


what happens to the minimum distance? 


(II) Puncturing a code by deleting coordinates. The inverse process to 
extending a code 𝒞 is called puncturing, and consists of deleting one or more 
coordinates from each codeword. E.g. puncturing the [3, 2, 2] code #9, 

000 
011 
101 
110 

by deleting the last coordinate gives the [2, 2, 1] code 

00 
01 
10 
11 

The punctured code is usually denoted by 𝒞*. 

In general each time a coordinate is deleted, the length n drops by 1, the 
number of codewords is unchanged, and (unless we are very lucky) the 
minimum distance d drops by 1. 


(III) Expurgating by throwing away codewords. The commonest way to 
expurgate a code is the following. Suppose 𝒞 is an [n, k, d] binary code 
containing codewords of both odd and even weight. Then it follows from 
Problem 10 that half the codewords have even weight and half have odd 
weight. We expurgate 𝒞 by throwing away the codewords of odd weight to 
get an [n, k - 1, d'] code. Often d' > d (for instance if d is odd). 


Example. Expurgating the [7, 4, 3] code #5 gives a [7, 3, 4] code. 


(IV) Augmenting by adding new codewords. The commonest way to augment 
a code is by adding the all-ones vector 1, provided it is not already in the 
code. This is the same as adding a row of 1's to the generator matrix. If 𝒞 is 
an [n, k, d] binary code which does not contain 1, the augmented code is 


𝒞^(a) = 𝒞 ∪ {1 + 𝒞}. 


I.e. 𝒞^(a) consists of the codewords of 𝒞 and their complements, and is an 
[n, k + 1, d^(a)] code, where 


d^(a) = min {d, n - d'}, 


and d' is the largest weight of any codeword of 𝒞. 


Example. Augmenting code #9 gives the [3,3, 1] code consisting of all 
vectors of length 3. 


(V) Lengthening by adding message symbols. The usual way to lengthen a 
code is to augment it by adding the codeword 1, and then to extend it by 
adding an overall parity check. This has the effect of adding one more 
message symbol. 


(VI) Shortening a code by taking a cross-section. An inverse operation to the 
lengthening process just described is to take the codewords which begin x_1 = 0 
and delete the x_1 coordinate. This is called taking a cross-section of the code, 
and will be used in later chapters to shorten nonlinear codes. 
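
Several of these operations act directly on the list of codewords, so they are easy to try out on small codes. The sketch below applies puncturing, expurgating, augmenting and shortening to the [3, 2, 2] code #9. The function names are hypothetical, codewords are stored as tuples of 0's and 1's, and positions are counted from 0.

```python
code9 = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]   # the [3, 2, 2] code #9

def puncture(code, position):
    """Delete one coordinate from every codeword."""
    return sorted({x[:position] + x[position + 1:] for x in code})

def expurgate(code):
    """Throw away the codewords of odd weight."""
    return [x for x in code if sum(x) % 2 == 0]

def augment(code):
    """Add the complements of all codewords (i.e. adjoin the all-ones vector)."""
    return sorted(set(code) | {tuple(1 - b for b in x) for x in code})

def shorten(code, position=0):
    """Cross-section: keep codewords with a 0 at `position`, then delete that coordinate."""
    return [x[:position] + x[position + 1:] for x in code if x[position] == 0]

if __name__ == "__main__":
    print(puncture(code9, 2))     # delete the last coordinate
    print(len(augment(code9)))    # all vectors of length 3
    print(shorten(code9))
```

Note that puncture collects the results in a set, so the number of codewords only stays the same when no two codewords agree outside the deleted coordinate.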


Dual of Hamming code. We illustrate these six operations by performing them 
on the Hamming code (Fig. 1.11) and its dual (Fig. 1.13). 

[Figure: three codes joined by labelled arrows. 
Extended Hamming code: [2^r, 2^r - 1 - r, 4], e.g. [16, 11, 4]. 
Hamming code ℋ_r: [2^r - 1, 2^r - 1 - r, 3], e.g. [15, 11, 3]. 
Even weight subcode of ℋ_r: [2^r - 1, 2^r - 2 - r, 4], e.g. [15, 10, 4]. 
Arrow labels: extend by adding overall parity check; puncture; lengthen; cross-section x_1 = 0; expurgate; augment (add 1).] 

Fig. 1.11. Variations on a Hamming code. 


Definition. The binary simplex code 𝒮_r is the dual of the Hamming code ℋ_r. By 
§8 we know 𝒮_r is a [2^r - 1, r] code with generator matrix G_r which is the 
parity check matrix of ℋ_r. E.g. for 𝒮_2, 


G_2 = [ 0 1 1 ] 
      [ 1 0 1 ] 


and the codewords are 


000 
011 
101 
110 

For 𝒮_3, 

      [ 0 0 0 | 1 1 1 1 ]     [ 0 0 0 | 1 | 1 1 1 ] 
G_3 = [ 0 1 1 | 0 0 1 1 ]  =  [  G_2  | 0 |  G_2  ] 
      [ 1 0 1 | 0 1 0 1 ]     [       | 0 |       ] 


and the codewords are 

000 0 000 
011 0 011 
101 0 101 
110 0 110 
000 1 111 
011 1 100 
101 1 010 
110 1 001 

We see by induction that 𝒮_r consists of 0 and 2^r - 1 codewords of weight 
2^{r-1}. (The reader may recognize this inductive process as one of the standard 
ways of building Hadamard matrices - more about this in the next chapter.) 
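
The weight property can be confirmed by direct enumeration for small r. In this sketch the function name simplex_code is hypothetical; the generator matrix has as its columns the binary representations of 1, ..., 2^r - 1, i.e. it is the Hamming parity check matrix in its natural order.

```python
from itertools import product

def simplex_code(r):
    """Codewords of the simplex code generated by the Hamming parity check matrix."""
    n = 2**r - 1
    G = [[(i >> (r - 1 - row)) & 1 for i in range(1, n + 1)] for row in range(r)]
    return [tuple(sum(u[i] * G[i][j] for i in range(r)) % 2 for j in range(n))
            for u in product([0, 1], repeat=r)]

if __name__ == "__main__":
    for r in (2, 3, 4):
        code = simplex_code(r)
        weights = sorted({sum(x) for x in code})
        print(r, len(code), weights)
```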

This is called a simplex code, because every pair of codewords is the same 
distance apart. So if the codewords were marked at the vertices of a unit cube 
in n dimensions, they would form a regular simplex. E.g. when r = 2, 𝒮_2 = 
code #9 forms a regular tetrahedron (the double lines in Fig. 1.12). 

[Figure: the unit cube with the four codewords of code #9 (000, 011, 101, 110) joined by double lines.] 

Fig. 1.12. Code #9 drawn as a tetrahedron. 


The simplex code 𝒮_r will also reappear later under the name of a maximal-length 
feedback shift register code (see §4 of Ch. 3 and Ch. 14). 







[Figure: three codes joined by labelled arrows. 
1st order Reed-Muller code: [2^r, r + 1, 2^{r-1}], e.g. [16, 5, 8]. 
Simplex code: [2^r - 1, r, 2^{r-1}], e.g. [15, 4, 8]. 
Punctured Reed-Muller code: [2^r - 1, r + 1, 2^{r-1} - 1], e.g. [15, 5, 7]. 
Arrow labels: extend by adding overall parity check; cross-section x_1 = 0; expurgate; augment (add 1).] 

Fig. 1.13. Variations on the simplex code. 


The dual of the extended Hamming code is also an important code, for it is 
a first-order Reed-Muller code (see Ch. 13). It is obtained by lengthening 𝒮_r as 
described in (V). For example lengthening 𝒮_3 in this way we obtain the code in 
Fig. 1.14. 

0000 0000 
0101 0101 
0011 0011 
0110 0110 
0000 1111 
0101 1010 
0011 1100 
0110 1001 
1111 1111 
1010 1010 
1100 1100 
1001 1001 
1111 0000 
1010 0101 
1100 0011 
1001 0110 

Fig. 1.14. Code #10, an [8, 4, 4] 1st order Reed-Muller code. 


Problem. (43) A signal set is a collection of real vectors x = (x_1 ... x_n). Define 
x · y = Σ_{i=1}^{n} x_i y_i (evaluated as a real number) and call x · x the energy of x. The 
unit vectors s^(1), ..., s^(n) (where s^(i) has a 1 in the i-th component and 0 
elsewhere) form an orthogonal signal set, since s^(i) · s^(j) = δ_ij. Consider the 
translated signal set {t^(i) = s^(i) - a}. Show that the total energy Σ_{i=1}^{n} t^(i) · t^(i) is 
minimized by choosing 

a = (1/n, ..., 1/n). 

The resulting {t^(i)} is called a simplex set (and is the continuous analog of the 
binary simplex code described above). The biorthogonal signal set {± s^(i)} is 
the continuous analog of the first order Reed-Muller code. 


§10. Some general properties of a linear code


To conclude this chapter we give several important properties of linear 
codes. The first three apply to linear codes over any field. 


Theorem 9. If H is the parity check matrix of a code of length n, then the code
has dimension n - r iff some r columns of H are linearly independent but no
r + 1 columns are. (Thus r is the rank of H.)


This requires no proof.


Theorem 10. If H is the parity check matrix of a code of length n, then the code
has minimum distance d iff every d - 1 columns of H are linearly independent
and some d columns are linearly dependent.


Proof. There is a codeword x of weight w iff $Hx^T = 0$ for some vector x of
weight w iff some w columns of H are linearly dependent. Q.E.D.


Theorem 10 gives another proof that Hamming codes have distance 3, for 
the columns of H are all distinct and therefore any 2 are independent, and 
there are three columns which are dependent. 
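
Theorem 10 can be turned directly into a (very slow) way of finding the minimum distance of a small binary code. The sketch below is an illustration only: the function name is ours, and each column of H is assumed to be stored as an integer whose bits are the entries of the column, so that addition over GF(2) is the XOR operation.

```python
from itertools import combinations

def min_distance_from_parity_check(columns):
    """Size of the smallest set of columns summing to zero over GF(2).

    By Theorem 10 this is the minimum distance of the code. Columns are
    integers (bit patterns); the search is exhaustive, so only small
    examples are practical.
    """
    for w in range(1, len(columns) + 1):
        for subset in combinations(columns, w):
            total = 0
            for c in subset:
                total ^= c
            if total == 0:
                return w
    return None  # all columns independent: the code contains only 0

hamming_columns = list(range(1, 8))  # all nonzero binary 3-tuples
print(min_distance_from_parity_check(hamming_columns))  # 3, since 1 ^ 2 ^ 3 == 0
```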


Problem. (44) Let G be the generator matrix for an [n, k, d] code $\mathcal{C}$.
Show that any k linearly independent columns of G may be taken as information
symbols; in other words, there is another generator matrix for $\mathcal{C}$ in
which these k columns form a unit matrix.


Theorem 11. (The Singleton bound.) If $\mathcal{C}$ is an [n, k, d] code, then
$n - k \ge d - 1$.


1st Proof. r = n - k is the rank of H and is the maximum number of linearly
independent columns. By Theorem 10 every d - 1 columns of H are linearly
independent, so $d - 1 \le r$.


2nd Proof. A codeword with only one nonzero information symbol has weight
at most n - k + 1. Therefore $d \le n - k + 1$. Q.E.D.


Codes with r = d - 1 are called maximum distance separable (abbreviated
MDS), and are studied in Chapter 11.

Theorems 6 and 11 have provided upper bounds on the size of a code with 
given minimum distance. Our final theorem is a lower bound, which says that 
good linear codes do in fact exist. 


Theorem 12. (The Gilbert-Varshamov bound.) There exists a binary linear
code of length n, with at most r parity checks and minimum distance at least d,
provided


    $1 + \binom{n-1}{1} + \binom{n-1}{2} + \cdots + \binom{n-1}{d-2} < 2^r.$    (44)


Proof. We shall construct an r × n matrix H with the property that no d - 1
columns are linearly dependent. By Theorem 10, this will establish the
theorem. The first column can be any nonzero r-tuple. Now suppose we have
chosen i columns so that no d - 1 are linearly dependent. There are at most


    $\binom{i}{1} + \binom{i}{2} + \cdots + \binom{i}{d-2}$


distinct linear combinations of these i columns taken d - 2 or fewer at a time.
Provided this number is less than $2^r - 1$ we can add another column different
from these linear combinations, and keep the property that any d - 1 columns
of the new r × (i + 1) array are linearly independent. We keep doing this as
long as


    $1 + \binom{i}{1} + \binom{i}{2} + \cdots + \binom{i}{d-2} < 2^r.$    Q.E.D.
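
The proof is constructive, and the greedy choice of columns can be sketched in a few lines of Python. This is an illustration under our own assumptions, not an efficient algorithm: the function name is hypothetical, columns are stored as integers (XOR is addition over GF(2)), and the next column is simply the smallest admissible one.

```python
from itertools import combinations

def gv_parity_check_columns(r, d, n):
    """Greedily choose n nonzero r-bit columns, no d - 1 of them dependent.

    Returns the list of columns, or None if the greedy search gets stuck.
    """
    cols = []
    for _ in range(n):
        forbidden = {0}
        # Every sum of 1, 2, ..., d - 2 of the columns chosen so far is excluded.
        for t in range(1, d - 1):
            for subset in combinations(cols, t):
                s = 0
                for c in subset:
                    s ^= c
                forbidden.add(s)
        candidate = next((v for v in range(1, 2 ** r) if v not in forbidden), None)
        if candidate is None:
            return None
        cols.append(candidate)
    return cols

print(gv_parity_check_columns(3, 3, 7))  # [1, 2, 3, 4, 5, 6, 7], a Hamming code
print(gv_parity_check_columns(4, 4, 5))  # [1, 2, 4, 7, 8]
```

With r = 4 and d = 4 the condition (44) holds for n = 5 (1 + 4 + 6 = 11 < 16), so the bound guarantees that the second call succeeds.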


For large n see Theorem 30 and Fig. 17.7 of Ch. 17. 


Problem. (45) The Gilbert-Varshamov bound continued. Prove that there
exists a linear code over a field of q elements, having length n, at most r
parity checks, and minimum distance at least d, provided


    $\sum_{i=0}^{d-2} (q-1)^i \binom{n-1}{i} < q^r.$


§11. Summary of Chapter 1


An [n, k, d] binary linear code contains $2^k$ codewords $x = x_1 \cdots x_n$,
$x_i = 0$ or 1, and any two codewords differ in at least d places. The code is
defined either as those codewords x such that $Hx^T = 0$, where H is a parity check
matrix, or as all linear combinations of the rows of a generator matrix G. Such
a code can correct $[\frac{1}{2}(d-1)]$ errors. Maximum likelihood decoding is done
using a standard array. At rates below the capacity of the channel, the error
probability can be made arbitrarily small by using sufficiently long codes.

A binary Hamming code $\mathcal{H}_r$ is a perfect single-error-correcting code with
parameters $[n = 2^r - 1, k = 2^r - 1 - r, d = 3]$, $r \ge 2$, and is constructed
from a parity check matrix H whose columns are all $2^r - 1$ distinct nonzero binary
r-tuples.


Notes on Chapter 1 


§1. The excellent books by Abramson [3], Gallager [464], Golomb [522] and
Wozencraft and Jacobs [1433] show in more detail how codes fit into
communication systems.

The following papers deal with the more practical aspects of coding theory
and the use of codes on real channels. Asabe et al. [31], Baumert and 
McEliece [88,89], Blythe and Edgcombe [167], Borel [171], Brayer et al. 
[191-194], Buchner [209], Burton [214], Burton and Sullivan [217], Chase 
[264, 265], Chien et al. [282,287], Corr [309], Cowell [311], Dorsch [380], 
Elliott [408], Falconer [414], Forney [438], Franaszek [448], Franco and 
Saporta [449], Fredricksen [450], Freiman & Robinson [458], Frey and 
Kavanaugh [460], Goodman and Farrell [535], Hellman [639], Hsu and Kasami 
[671], the special issues [678,679], Jacobs [686], Kasami et al. [735,742], 
Klein and Wolf [769], Lerner [818], Murthy [977], Posner [1071], Potton 
[1073], Ralphs [1088], Rocher and Pickholtz [1120], Rogers [1121], Schmandt 
[1156], Tong [1333-1336], Townsend and Watts [1337], Tróndle [1340] and 
Wright [1434]. 

The following are survey articles on coding theory: Assmus and Mattson 
[47], Bose [176], Dobrushin [378], Goethals [494], Jacobs [686], Kasami [724], 
Kautz [749], Kautz and Levitt [753], Kiyasu [762], MacWilliams [876], Sloane 
[1225], Viterbi [1374], Wolf [1429], Wyner [1439] and Zadeh [1450]. 

We have described the channel as being a communication path, such as a 
telegraph line, but codes can be applied equally well in other situations, for 
example when data is stored in a computer and later retrieved. Various 
applications of codes to computers are described by Brown and Sellers [203], 
Chien [278, 281, 284], Davydov [337], Davydov and Tenengol'ts [338], Fischler 
[430], Hong and Patel [662], Hsiao et al. [668—670], Kasahara et al. [720], 
Lapin [794], Malhotra and Fisher [890], Oldham et al. [1011], Patel and Hong 
[1029] and Sloane [1230]. 

We usually consider only a binary symmetric channel, which has no
memory from one symbol to the next, and doesn't lose (or add) symbols. Most
real channels are not like this, but have bursts of noise, and lose
synchronization. (For example, one channel has been described as "a very good
channel, with errors predominantly due to a noisy Coke machine near the receiver"
[438].) A lot of work has been done on using codes for synchronizing; see for
example Bose and Caldwell [182], Brown [202], Calabi and Hartnett [226, 
227], Eastman and Even [400], Freiman [456], Freiman and Wyner [459], 
Gilbert [482], Golomb et al. [526, 527], Hatcher [627], Hellman [640], Kautz 
[751], Levenshtein [820-822], Levy [829], Mandelbaum [900], McEliece [943], 
Pierce [1045], Rudner [1129], Scholtz et al. [1163-1166], Shimo et al. [1197], 
Shiva and Seguin [1203], Stanley and Yoder [1263], Stiffler [1278-1280], Tanaka
and Kasai [1302], Tavares and Fukuda [1312, 1313], Tong [1333], Ullman [1358] 
and Varshamov [1362-1365]. See also the papers on comma-free codes 
mentioned in the introduction. Burst-correcting codes are discussed in Ch. 10. 


§2. Other kinds of codes. Our codes are block codes: a block of message
symbols becomes a codeword. Very powerful rival codes, not considered
here, are convolutional and tree codes, which encode the message symbols
continuously without breaking them up into blocks. These can be very
efficiently decoded either by sequential decoding or by the Viterbi algorithm. 
See Forney [439, 441-444], Forney and Bower [445], Heller and Jacobs [638], 
Jelinek [691], Viterbi [1372, 1373] and Wu [1435, 1436]. 

On the other hand cryptographic codes have little in common with our 
codes: their goal is to conceal information, whereas our codes have the 
opposite aim. See Feistel et al. [422, 423], Gaines [463], Geffe [469, 470],
Gilbert et al. [484], Hellman [640a], Kahn [707] and Shannon [1190]. 

Another class of codes are those for correcting errors in arithmetic 
operations (see for example Peterson and Weldon [1040] and Rao [1093]). 


§3. Slepian [1224] calls Fig. 1.3 the information theorist’s coat-of-arms. For 
binomial coefficients see Riordan [1114]. Problem 19 is from Shiva [1199]. 


§4. Linear codes. The basic papers are by Slepian [1217-1219]. To help with
synchronizing, sometimes a coset of the code is used rather than the code
itself; see Posner [1071, Fig. VII].


§5. Theorem 6 is due to Hamming [592]. For lots more about the error
probability of codes see for example Batman and McEliece [80], Cain and 
Simpson [224], Crimmins et al. [317-319], Gallager [464], Jelinek [690], 
Leont'ev [817], Posner [1070], Redinbo and Wolf [1103], Slepian [1220], 
Sullivan [1292] and Wyner [1437, 1438]. Hobbs [656] gives an approximation 
to the distribution of coset leaders of any code. 


§6. For a proof of Shannon's Theorem (Shannon [1188, 1189]) see Gallager
[464, §6.2].


Remark. Incomplete decoding (see Fig. 1.8) which corrects all error patterns
of weight $\le t = [(d-1)/2]$ and no others is called bounded distance decoding. It
follows from the Elias or McEliece-Rodemich-Rumsey-Welch bounds
(Theorem 34, 35, or 37 of Ch. 17) that bounded distance decoding does not
achieve channel capacity; for details see Wyner [1437] or Forney [437]. On
the other hand only a slightly more complicated decoding scheme will achieve
capacity (e.g. Abramson [3, p. 167]).


§7. Hamming codes were discovered by Golay [506, 514a] and Hamming
[592]; see also Problem 8 of Ch. 7. 


§8. In the language of vector spaces, a vector u is called isotropic if
$u \cdot u = 0$, and a weakly self dual code is called totally isotropic. (See for
example Lam [790].)


§9. Deza [373], Farber [416], Hall [580], Hall et al. [581], Landau and Slepian
[792], Van Lint [852] and Tanner [1308, 1309] study simplex codes. For more 
about Problem 43 see Wozencraft and Jacobs [1433, p. 257]. 


§10. Although Theorem 11 is nowadays usually called the Singleton bound
(referring to Singleton [1214]), Joshi [700] attributes this result to Komamiya 
[775]. It applies also to nonlinear codes (Problem 3 of Ch. 11 and Problem 17 
of Ch. 17). For Theorem 12 see Gilbert [479], Varshamov [1362] and Sacks 
[1140]. 


Nonlinear codes, Hadamard 
matrices, designs and 
the Golay code 


§1. Nonlinear codes 


"... good codes just might be messy" 
J.L. Massey 


One basic purpose of codes is to correct errors on noisy communication 
channels, and for this purpose linear codes, introduced in Ch. 1, have many 
practical advantages. But if we want to obtain the largest possible number of 
codewords with a given minimum distance, we must sometimes use nonlinear 
codes. 

For example, suppose we want a double-error-correcting code of length 11. 
The largest linear code has 16 codewords, whereas there is a nonlinear code, 
shown in Fig. 2.1, which contains 24 codewords, an increase of 50%. (This is a
Hadamard code, see §3.)

Our notation for nonlinear codes is given by the following. 


Definition. An (n, M, d) code is a set of M vectors of length n (with 
components from some field F) such that any two vectors differ in at least d 
places, and d is the largest number with this property. 


In this chapter all codes are binary. Note that square brackets denote a 
linear code, while round parentheses are used for a code which may or may 
not be linear. An [n, k, d] binary linear code is an $(n, 2^k, d)$ code.

We usually assume that there is no coordinate place in which every
codeword is zero (otherwise it would be an (n - 1, M, d) code). Also, since the
distances between codewords are unchanged if a constant vector is added to
all the codewords, we may, if we wish, assume that the code contains the zero
vector.

[Fig. 2.1. The first twelve rows form the (11, 12, 6) Hadamard code
$\mathcal{A}_{12}$; all 24 rows form the (11, 24, 5) Hadamard code $\mathcal{B}_{12}$.]

We say that two (n, M, d) binary codes $\mathcal{C}$ and $\mathcal{D}$ are
equivalent if one can be obtained from the other by permuting the n symbols and
adding a constant vector, or more formally if there is a permutation $\pi$ and a
vector a such that $\mathcal{D} = \{\pi(u) + a : u \in \mathcal{C}\}$. If
$\mathcal{C}$ and $\mathcal{D}$ are linear this reduces to the definition of
equivalence given in §7 of Ch. 1. As an example,

    000        111
    011        100
    101  and   010
    110        001


are equivalent codes (take a = 111).
[More generally, the equivalence of nonbinary codes is defined as follows.
Let $\mathcal{C}$ and $\mathcal{D}$ be codes of length n over a field of q elements.
Then $\mathcal{C}$ and $\mathcal{D}$ are equivalent if there exist n permutations
$\pi_1, \ldots, \pi_n$ of the q elements and a permutation $\sigma$ of the n
coordinate positions such that

    if $(u_1, \ldots, u_n) \in \mathcal{C}$ then
    $\sigma(\pi_1(u_1), \ldots, \pi_n(u_n)) \in \mathcal{D}$.


In words, the field elements in each coordinate place are permuted and then
the coordinate places themselves are permuted. If $\mathcal{C}$ and $\mathcal{D}$
are both linear only those $\pi_i$ generated by scalar multiples and field
automorphisms (§6 of Ch. 4) can be used.]


Definition. If $\mathcal{C}$ is an (n, M, d) code, let $A_i$ be the number of
codewords of weight i. The numbers $A_0, A_1, \ldots, A_n$ are called the weight
distribution of $\mathcal{C}$. Of course $A_0 + A_1 + \cdots + A_n = M$. We have
already used the weight distribution in calculating error probabilities in Ch. 1.


If 0 is a codeword, then an observer sitting on 0 would see A; codewords at 
distance i from him. For a linear code the view would be the same from any 
codeword. (Why?) However, in a nonlinear code this need not be true, as 
shown by the code {00, 01, 11}. (If it is true, the code is said to be distance
invariant. The Nordstrom-Robinson code (§8) is distance invariant, and other
examples will be given in Ch. 6.) Therefore for a nonlinear code it is useful to
consider the average number of codewords at distance i from a fixed 
codeword. 


Definition. The distance distribution of $\mathcal{C}$ consists of the numbers
$B_0, B_1, \ldots, B_n$, where


    $B_i = \frac{1}{M}$ · (number of ordered pairs of codewords
    u, v such that dist (u, v) = i).


Note that $B_0 = 1$ and $B_0 + B_1 + \cdots + B_n = M$. For linear codes the weight
and distance distributions coincide. Also a translated code $a + \mathcal{C}$ has
the same distance distribution as $\mathcal{C}$.
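
To see how the two distributions can differ for a nonlinear code, here is a small Python sketch (the function names are our own, and codewords are assumed to be given as tuples of 0's and 1's). It evaluates both definitions for the code {00, 01, 11} mentioned above, using exact fractions for the $B_i$.

```python
from collections import Counter
from fractions import Fraction

def weight_distribution(code):
    """A_0, ..., A_n: the number of codewords of each weight."""
    n = len(code[0])
    counts = Counter(sum(word) for word in code)
    return [counts.get(i, 0) for i in range(n + 1)]

def distance_distribution(code):
    """B_0, ..., B_n: ordered pairs at each distance, divided by M."""
    n = len(code[0])
    M = len(code)
    counts = Counter(
        sum(a != b for a, b in zip(u, v)) for u in code for v in code
    )
    return [Fraction(counts.get(i, 0), M) for i in range(n + 1)]

code = [(0, 0), (0, 1), (1, 1)]
print(weight_distribution(code))    # [1, 1, 1]
print(distance_distribution(code))  # [Fraction(1, 1), Fraction(4, 3), Fraction(2, 3)]
```

The $B_i$ sum to M = 3, as they should, but they are not integers, so no single codeword sees exactly this distribution.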


It is helpful to think of these codes geometrically. A binary vector
$(a_1, \ldots, a_n)$ of length n gives the coordinates of a vertex of a unit cube
in n dimensions. Then an (n, M, d) code is just a subset of these vertices
(Fig. 2.2).

[Fig. 2.2. A cube showing a (3, 2, 3) code; a 4-dimensional cube, or tesseract,
showing a (4, 8, 2) code.]

In this geometrical language the coding theory problem is to choose as
many vertices of the cube as possible while keeping them a certain distance
apart.

This is a packing problem, for if the code has minimum distance d, the
Euclidean distance between codewords is $\ge \sqrt{d}$. So finding an (n, M, d) code
means finding M nonoverlapping spheres of radius $\frac{1}{2}\sqrt{d}$ with centers
at the vertices of the cube.


Aside: Research Problem 2.1. The analogous problem of placing M points on 
the surface of a unit sphere in n dimensions is also unsolved. In other words, 
where should M misanthropes build houses on the surface of a planet, so as 
to maximize the smallest distance between any two houses? This problem is 
also important for communication theory - it is the problem of designing the 
best signals for transmitting over a band-limited channel. 


§2. The Plotkin bound


Theorem 1. (The Plotkin bound.) For any (n, M, d) code $\mathcal{C}$ for which
n < 2d, we have


    $M \le 2\left[\frac{d}{2d-n}\right].$    (1)


Proof. We shall calculate the sum


    $\sum_{u \in \mathcal{C}} \sum_{v \in \mathcal{C}} \mathrm{dist}(u, v)$


in two ways. First, since dist (u, v) $\ge d$ if $u \ne v$, the sum is
$\ge M(M-1)d$. On the other hand, let A be the M × n matrix whose rows are the
codewords. Suppose the i-th column of A contains $x_i$ 0's and $M - x_i$ 1's. Then
this column contributes $2x_i(M - x_i)$ to the sum, so that the sum is equal to


    $\sum_{i=1}^{n} 2x_i(M - x_i).$


If M is even this expression is maximized if all $x_i = \frac{1}{2}M$, and the sum
is $\le \frac{1}{2}nM^2$. Thus we have


    $M(M-1)d \le \frac{1}{2}nM^2$

or

    $M \le \frac{2d}{2d-n}.$    (2)


But M is even, so this implies


    $M \le 2\left[\frac{d}{2d-n}\right].$


On the other hand if M is odd, the sum is $\le n(M^2 - 1)/2$, and instead of (2) we
get


    $M \le \frac{n}{2d-n} = \frac{2d}{2d-n} - 1.$


This implies


    $M \le \left[\frac{2d}{2d-n}\right] - 1 \le 2\left[\frac{d}{2d-n}\right],$


using $[2x] \le 2[x] + 1$. Q.E.D.





Example. The top half of Fig. 2.1 forms an (11, 12, 6) code ($\mathcal{A}_{12}$ in
the notation of §3) for which equality holds in (1).
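
The double counting in the proof is easy to try on a small code. In the Python sketch below the function names are ours, and the test code is the four-word code 000, 011, 101, 110, used here only as a convenient (3, 4, 2) example that meets the bound.

```python
def plotkin_bound(n, d):
    """The bound 2[d / (2d - n)] of Theorem 1; requires n < 2d."""
    if n >= 2 * d:
        raise ValueError("Theorem 1 needs n < 2d")
    return 2 * (d // (2 * d - n))

def distance_sum_by_pairs(code):
    """Sum of dist(u, v) over all ordered pairs of codewords."""
    return sum(sum(a != b for a, b in zip(u, v)) for u in code for v in code)

def distance_sum_by_columns(code):
    """The same sum, computed column by column as in the proof."""
    M = len(code)
    total = 0
    for column in zip(*code):
        zeros = column.count(0)
        total += 2 * zeros * (M - zeros)
    return total

code = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]  # a (3, 4, 2) code
print(distance_sum_by_pairs(code), distance_sum_by_columns(code))  # 24 24
print(plotkin_bound(3, 2), plotkin_bound(11, 6))  # 4 12
```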


Problems. (1) Does there exist a (16, 10, 9) code? 
(2) Construct a (16, 32, 8) code. [Hint: from the proof of Theorem 1, each 
column must contain half 0’s and half 1’s.] 


Let A(n, d) denote the largest number M of codewords in any (n, M, d) 
code. The next theorem shows it is enough to find A(n, d) for even values 
of d: 


Theorem 2.
    $A(n, 2r - 1) = A(n + 1, 2r).$

Proof. Let $\mathcal{C}$ be an (n, M, 2r - 1) code. By adding an overall parity
check (Ch. 1, §9), we get an (n + 1, M, 2r) code, thus


    $A(n, 2r - 1) \le A(n + 1, 2r).$


Conversely, given an (n + 1, M, 2r) code, deleting one coordinate gives an
$(n, M, d \ge 2r - 1)$ code, thus


    $A(n, 2r - 1) \ge A(n + 1, 2r).$    Q.E.D.
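
Both halves of this proof can be watched in action on a small code. The following Python sketch is an illustration under our own assumptions: the helper names are ours, and the test code is the [7, 4, 3] Hamming code generated by one particular choice of generator matrix G.

```python
from itertools import product

def min_distance(code):
    """Smallest Hamming distance between two distinct codewords."""
    return min(sum(a != b for a, b in zip(u, v))
               for u in code for v in code if u != v)

def extend(code):
    """Append an overall parity check to every codeword."""
    return [word + (sum(word) % 2,) for word in code]

def puncture(code):
    """Delete the last coordinate of every codeword."""
    return [word[:-1] for word in code]

# Assumed example: a generator matrix of the [7, 4, 3] Hamming code.
G = [(1, 0, 0, 0, 0, 1, 1),
     (0, 1, 0, 0, 1, 0, 1),
     (0, 0, 1, 0, 1, 1, 0),
     (0, 0, 0, 1, 1, 1, 1)]
hamming = [tuple(sum(m * g[j] for m, g in zip(msg, G)) % 2 for j in range(7))
           for msg in product((0, 1), repeat=4)]

extended = extend(hamming)
print(min_distance(hamming), min_distance(extended),
      min_distance(puncture(extended)))  # 3 4 3
```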


Theorem 3.
    $A(n, d) \le 2A(n - 1, d).$


Proof. Given an (n, M, d) code, divide the codewords into two classes, those
beginning with 0 and those beginning with 1. One class must contain at least
half of the codewords, thus


    $A(n - 1, d) \ge \frac{1}{2}A(n, d).$    Q.E.D.


Corollary 4. (The Plotkin bound.) If d is even and 2d > n, then

    $A(n, d) \le 2\left[\frac{d}{2d-n}\right],$    (3)

    $A(2d, d) \le 4d.$    (4)

If d is odd and 2d + 1 > n, then

    $A(n, d) \le 2\left[\frac{d+1}{2d+1-n}\right],$    (5)

    $A(2d + 1, d) \le 4d + 4.$    (6)

Proof. To prove (4), we have from Theorem 3 and Equation (1)

    $A(4r, 2r) \le 2A(4r - 1, 2r) \le 8r.$

If d is odd, then by Theorem 2

    $A(n, d) = A(n + 1, d + 1) \le 2\left[\frac{d+1}{2d+1-n}\right].$

Equation (6) follows similarly. Q.E.D.


If Hadamard matrices exist of all possible orders (which has not yet been 
proved), then in fact equality holds in Equations (3)-(6). Thus the Plotkin 
bound is tight, in the sense that there exist codes which meet this bound. This 
is Levenshtein's theorem, which is proved in the next section.


§3. Hadamard matrices and Hadamard codes


Definition. A Hadamard matrix H of order n is an n × n matrix of +1's and
-1's such that


    $HH^T = nI.$    (7)


In other words the real inner product of any two distinct rows of H is zero,
i.e., distinct rows are orthogonal, and the real inner product of any row with
itself is n. Since $H^{-1} = (1/n)H^T$, we also have $H^TH = nI$, thus the columns
have the same properties.


It is easy to see that multiplying any row or column by —1 changes H into 
another Hadamard matrix. By this means we can change the first row and 
column of H into +1’s. Such a Hadamard matrix is called normalized. 

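Normalized Hadamard matrices of orders 1, 2, 4, 8 are shown in Fig. 2.3
where we have written - instead of -1, a convention we use throughout the
book.


    n = 1:  H_1 = (1)

    n = 2:  H_2 =  1 1
                   1 -

    n = 4:  H_4 =  1 1 1 1
                   1 - 1 -
                   1 1 - -
                   1 - - 1

    n = 8:  H_8 =  1 1 1 1 1 1 1 1
                   1 - 1 - 1 - 1 -
                   1 1 - - 1 1 - -
                   1 - - 1 1 - - 1
                   1 1 1 1 - - - -
                   1 - 1 - - 1 - 1
                   1 1 - - - - 1 1
                   1 - - 1 - 1 1 -


Fig. 2.3. Sylvester-type Hadamard matrices of order 1, 2, 4 and 8.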


Theorem 5. If a Hadamard matrix H of order n exists, then n is 1, 2 or a
multiple of 4.


Proof. Without loss of generality we may suppose H is normalized. Suppose
$n \ge 3$ and the composition of the first three rows of H is as follows:


    1 1 ... 1    1 1 ... 1    1 1 ... 1    1 1 ... 1
    1 1 ... 1    1 1 ... 1    - - ... -    - - ... -
    1 1 ... 1    - - ... -    1 1 ... 1    - - ... -
        i            j            k            l


Since the rows are orthogonal, we have


    i + j - k - l = 0,
    i - j + k - l = 0,
    i - j - k + l = 0,

which imply i = j = k = l, thus n = 4i is a multiple of 4. Q.E.D.


It is conjectured that Hadamard matrices exist whenever the order is a 
multiple of 4, although this has not yet been proved. A large number of 
constructions are known, and the smallest order for which a Hadamard matrix 
has not been constructed is (in 1977) 268. We give two constructions which 
are important for coding theory. 


Construction I. If $H_n$ is a Hadamard matrix of order n, then it is easy to verify
that

    $H_{2n} = \begin{pmatrix} H_n & H_n \\ H_n & -H_n \end{pmatrix}$


is a Hadamard matrix of order 2n. Starting with $H_1 = (1)$, this gives $H_2$,
$H_4$, $H_8$ (as shown in Fig. 2.3), ... and so Hadamard matrices of all orders
which are powers of two. These are called Sylvester matrices.
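
Construction I translates almost word for word into code. The Python sketch below (function names ours; matrices are stored as lists of rows with entries +1 and -1) builds the Sylvester matrix of order $2^m$ and checks the defining property (7) directly.

```python
def sylvester(m):
    """Hadamard matrix of order 2**m from Construction I, starting with H_1 = (1)."""
    H = [[1]]
    for _ in range(m):
        # Top half is (H H), bottom half is (H -H).
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def is_hadamard(H):
    """Check that distinct rows are orthogonal and each row has norm n."""
    n = len(H)
    return all(
        sum(a * b for a, b in zip(H[i], H[j])) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )

H8 = sylvester(3)
print(is_hadamard(H8))  # True
print("".join("1" if x == 1 else "-" for x in H8[-1]))  # 1--1-11-
```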


For the second construction we need some facts about quadratic residues.

Quadratic residues

Definition. Let p be an odd prime. The nonzero squares modulo p, i.e., the
numbers $1^2, 2^2, 3^2, \ldots$ reduced mod p, are called the quadratic residues
mod p, or simply the residues mod p.

To find the residues mod p it is enough to consider the squares of the
numbers from 1 to p - 1. In fact since
$(p - a)^2 \equiv (-a)^2 \equiv a^2 \pmod p$, it is enough to consider the squares


    $1^2, 2^2, \ldots, \left(\frac{p-1}{2}\right)^2 \pmod p.$


These are all distinct, for if $i^2 \equiv j^2 \pmod p$, with
$1 \le i, j \le \frac{1}{2}(p-1)$, then p divides (i - j)(i + j), which is only
possible if i = j.

Therefore there are $\frac{1}{2}(p-1)$ quadratic residues mod p. The
$\frac{1}{2}(p-1)$ remaining numbers mod p are called nonresidues. Zero is neither a
residue nor a nonresidue.

For example if p = 11, the quadratic residues mod 11 are


    $1^2 = 1$, $2^2 = 4$, $3^2 = 9$, $4^2 = 16 \equiv 5$, and $5^2 = 25 \equiv 3$

mod 11, i.e. 1, 3, 4, 5, and 9. The remaining numbers 2, 6, 7, 8 and 10 are the
nonresidues. [The reader is reminded that


    $A \equiv B$ (modulo C),

pronounced "A is congruent to B mod C", means that

    $C \mid A - B$

where the vertical slash means "divides," or equivalently

    A - B is a multiple of C.


Then A and B are in the same residue class mod C.]

We now state some properties of quadratic residues. 

(Q1) The product of two quadratic residues or of two nonresidues is a 
quadratic residue, and the product of a quadratic residue and a nonresidue is a 
nonresidue. The proof is left as an exercise. 

(Q2) If p is of the form 4k + 1, -1 is a quadratic residue mod p. If p is of
the form 4k + 3, -1 is a nonresidue mod p.

The proof is postponed until Ch. 4.

(Q3) Let p be an odd prime. The function $\chi$, called the Legendre symbol, is
defined on the integers by


    $\chi(i) = 0$ if i is a multiple of p,
    $\chi(i) = 1$ if the remainder when i is divided by p is a quadratic
                  residue mod p, and
    $\chi(i) = -1$ if the remainder is a nonresidue.
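
These definitions can be tried out directly. The following Python sketch uses names of our own choosing and finds the residues by squaring, exactly as in the text, rather than by any faster method.

```python
def quadratic_residues(p):
    """The nonzero squares modulo an odd prime p, in increasing order."""
    return sorted({(a * a) % p for a in range(1, (p - 1) // 2 + 1)})

def legendre(i, p):
    """The Legendre symbol chi(i) for an odd prime p."""
    i %= p
    if i == 0:
        return 0
    return 1 if i in set(quadratic_residues(p)) else -1

print(quadratic_residues(11))              # [1, 3, 4, 5, 9]
print([legendre(i, 7) for i in range(7)])  # [0, 1, 1, -1, 1, -1, -1]
print(legendre(-1, 13), legendre(-1, 11))  # 1 -1, in agreement with (Q2)
```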


Theorem 6. For any $c \not\equiv 0 \pmod p$,


    $\sum_{b=0}^{p-1} \chi(b)\chi(b + c) = -1.$


Proof. From (Q1) it follows that

    $\chi(xy) = \chi(x)\chi(y)$ for $0 \le x, y \le p - 1$.

The term b = 0 contributes zero to the sum. Now suppose $b \ne 0$ and let
$z \equiv (b + c)/b \pmod p$. There is a unique integer z, $0 \le z \le p - 1$, for
each b. As b runs from 1 to p - 1, z takes on the values 0, 2, ..., p - 1 but not 1.
Then

    $\sum_{b=0}^{p-1} \chi(b)\chi(b + c) = \sum_{b=1}^{p-1} \chi(b)\chi(bz)$

    $= \sum_{b=1}^{p-1} \chi(b)^2 \chi(z)$

    $= \sum_{z=0,\, z \ne 1}^{p-1} \chi(z) = 0 - \chi(1) = -1.$    Q.E.D.
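
Theorem 6 is easy to confirm numerically for small primes. In the Python sketch below the function names `chi` and `correlation` are ours, and the Legendre symbol is computed naively from the list of squares.

```python
def chi(i, p):
    """Legendre symbol of i modulo an odd prime p (naive computation)."""
    i %= p
    if i == 0:
        return 0
    squares = {(a * a) % p for a in range(1, p)}
    return 1 if i in squares else -1

def correlation(c, p):
    """The sum over b = 0, ..., p - 1 of chi(b) * chi(b + c)."""
    return sum(chi(b, p) * chi(b + c, p) for b in range(p))

print([correlation(c, 11) for c in range(1, 11)])  # ten copies of -1
print(correlation(0, 11))                          # 10, i.e. p - 1
```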


Remark. Very similar properties hold when p is replaced by any odd prime
power $p^m$, the numbers 0, 1, ..., p - 1 are replaced by the elements of the
finite field $GF(p^m)$ (see Ch. 3), and the quadratic residues are defined to be the
nonzero squares in this field.


Construction II: The Paley construction. This construction gives a Hadamard
matrix of any order n = p + 1 which is a multiple of 4 (or of order $n = p^m + 1$
if we use quadratic residues in $GF(p^m)$).

We first construct the Jacobsthal matrix $Q = (q_{ij})$. This is a p × p matrix
whose rows and columns are labeled 0, 1, ..., p - 1, and $q_{ij} = \chi(j - i)$.
See Fig. 2.4 for the case p = 7.


             0 1 2 3 4 5 6
         0 | 0 1 1 - 1 - -
         1 | - 0 1 1 - 1 -
         2 | - - 0 1 1 - 1
    Q =  3 | 1 - - 0 1 1 -
         4 | - 1 - - 0 1 1
         5 | 1 - 1 - - 0 1
         6 | 1 1 - 1 - - 0


Fig. 2.4. A Jacobsthal matrix.


Note that $q_{ji} = \chi(i - j) = \chi(-1)\chi(j - i) = -q_{ij}$ since p is of the
form 4k - 1 (see property (Q2)), and so Q is skew-symmetric, i.e., $Q^T = -Q$.


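Lemma 7. $QQ^T = pI - J$, and $QJ = JQ = 0$, where J is the matrix all of whose
entries are 1.

Proof. Let $P = (p_{ij}) = QQ^T$. Then

    $p_{ii} = \sum_{k=0}^{p-1} q_{ik}^2 = p - 1,$

    $p_{ij} = \sum_{k=0}^{p-1} q_{ik} q_{jk} = \sum_{k=0}^{p-1} \chi(k - i)\chi(k - j)$ for $i \ne j$,

    $= \sum_{b=0}^{p-1} \chi(b)\chi(b + c)$ where b = k - i, c = i - j,

    $= -1$ by property (Q3).


Also QJ = JQ = 0 since each row and column of Q contains $\frac{1}{2}(p-1)$ +1's
and $\frac{1}{2}(p-1)$ -1's. Q.E.D.

Now let


    $H = \begin{pmatrix} 1 & \mathbf{1} \\ \mathbf{1}^T & Q - I \end{pmatrix}.$    (8)


Then


    $HH^T = \begin{pmatrix} 1 & \mathbf{1} \\ \mathbf{1}^T & Q - I \end{pmatrix}
            \begin{pmatrix} 1 & \mathbf{1} \\ \mathbf{1}^T & Q^T - I \end{pmatrix}
          = \begin{pmatrix} p + 1 & 0 \\ 0 & J + (Q - I)(Q^T - I) \end{pmatrix}.$


But from Lemma 7 $J + (Q - I)(Q^T - I) = J + pI - J - Q - Q^T + I = (p + 1)I$, so
$HH^T = (p + 1)I_{p+1}$. Thus H is a normalized Hadamard matrix of order p + 1,
which is said to be of Paley type.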
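
The whole of Construction II fits in a short program. The Python sketch below is an illustration for prime p of the form 4k - 1 only (the prime-power case needs finite field arithmetic and is not attempted); the function names are ours, and the Legendre symbol is again computed naively.

```python
def chi(i, p):
    """Legendre symbol of i modulo an odd prime p (naive computation)."""
    i %= p
    if i == 0:
        return 0
    squares = {(a * a) % p for a in range(1, p)}
    return 1 if i in squares else -1

def paley(p):
    """Normalized Hadamard matrix of order p + 1 as in (8), p prime, p = 4k - 1."""
    Q = [[chi(j - i, p) for j in range(p)] for i in range(p)]  # Jacobsthal matrix
    H = [[1] * (p + 1)]
    for i in range(p):
        # Row i of Q - I, bordered on the left by a 1.
        H.append([1] + [Q[i][j] - (1 if i == j else 0) for j in range(p)])
    return H

def is_hadamard(H):
    """Check that distinct rows are orthogonal and each row has norm n."""
    n = len(H)
    return all(
        sum(a * b for a, b in zip(H[i], H[j])) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )

print(is_hadamard(paley(7)), is_hadamard(paley(11)))  # True True
```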


Example. Fig. 2.5 shows the Hadamard matrices of orders 8 and 12 obtained 
in this way. 


[Fig. 2.5. Hadamard matrices of orders 8 and 12.]


Constructions I and II together give Hadamard matrices of all orders 
1,2,4,8,12,..., 32. 

Let us call two Hadamard matrices equivalent if one can be obtained from the 
other by permuting rows and columns and multiplying rows and columns by — 1. 
Then it is easy to see that there is only one equivalence class of Hadamard 
matrices of orders 1,2 and 4. We shall see in Ch. 20 that there is only one class of 
order 8 (so the Hadamard matrices of order 8 shown in Figs. 2.3 and 2.5 are 
equivalent) and one class of order 12. Furthermore it is known that there are five 
classes of order 16, and 3 of order 20. The number of order 24 is unknown. 


Problem. (3) If $n = 2^m$, let $u_1, \ldots, u_n$ denote the distinct binary
m-tuples. Show that the matrix $H = (h_{ij})$ where $h_{ij} = (-1)^{u_i \cdot u_j}$
is a Hadamard matrix of order n which is equivalent to that obtained from
Construction I.


Hadamard Codes. Let $H_n$ be a normalized Hadamard matrix of order n. If
+1's are replaced by 0's and -1's by 1's, $H_n$ is changed into the binary
Hadamard matrix $A_n$. Since the rows of $H_n$ are orthogonal, any two rows of
$A_n$ agree in $\frac{1}{2}n$ places and differ in $\frac{1}{2}n$ places, and so have
Hamming distance $\frac{1}{2}n$ apart.

$A_n$ gives rise to three Hadamard codes:

(i) an $(n - 1, n, \frac{1}{2}n)$ code $\mathcal{A}_n$, consisting of the rows of
$A_n$ with the first column deleted ($\mathcal{A}_8$ is shown in Fig. 2.7 and
$\mathcal{A}_{12}$ is the top half of Fig. 2.1 if the rows are read backwards);

(ii) an $(n - 1, 2n, \frac{1}{2}n - 1)$ code $\mathcal{B}_n$, consisting of
$\mathcal{A}_n$ together with the complements of all its codewords
($\mathcal{B}_{12}$ is shown in Fig. 2.1); and

(iii) an $(n, 2n, \frac{1}{2}n)$ code $\mathcal{C}_n$, consisting of the rows of
$A_n$ and their complements.

$\mathcal{A}_n$ is a simplex code, since the distance between two codewords is
$\frac{1}{2}n$ (see Ch. 1, §9). In fact these three codes are nonlinear
generalizations of the codes in Fig. 1.13. Furthermore if $H_n$, with $n = 2^r$, is
obtained from Construction I, these are linear codes and are the same as the codes
shown in Fig. 1.13.

On the other hand if $H_n$ is obtained from Construction II, the resulting
codes are nonlinear for n > 8. We get linear codes by taking the linear span of
these codes; these are called quadratic residue codes, and will be studied in
Ch. 16. But if $H_n$ is not obtained from Constructions I or II, little appears to
be known about the binary rank of $A_n$.
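
As a check on the three sets of parameters, the following Python sketch (function names ours) forms the three codes from the Sylvester matrix of order 8 and computes their length, size and minimum distance by brute force.

```python
def sylvester(m):
    """Hadamard matrix of order 2**m from Construction I."""
    H = [[1]]
    for _ in range(m):
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def complement(word):
    """Interchange the 0's and 1's of a binary word."""
    return tuple(1 - x for x in word)

def hadamard_codes(H):
    """The three Hadamard codes obtained from a normalized Hadamard matrix H."""
    binary = [tuple(0 if x == 1 else 1 for x in row) for row in H]
    A = [row[1:] for row in binary]              # first column deleted
    B = A + [complement(w) for w in A]           # A and its complements
    C = binary + [complement(w) for w in binary] # rows and their complements
    return A, B, C

def parameters(code):
    """(length, number of codewords, minimum distance), by exhaustive search."""
    d = min(sum(a != b for a, b in zip(u, v))
            for u in code for v in code if u != v)
    return len(code[0]), len(code), d

A, B, C = hadamard_codes(sylvester(3))
print(parameters(A), parameters(B), parameters(C))  # (7, 8, 4) (7, 16, 3) (8, 16, 4)
```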


Problem. (4) Show that if n is a multiple of 8, $A_n A_n^T \equiv 0 \pmod 2$, and
hence the binary rank of $A_n$ is $\le \frac{1}{2}n$. [Hint: use the fact that if B
and C are n × n matrices over any field then
rank (B) + rank (C) $\le$ n + rank (BC) (see Theorem 3.11 of Marcus [914]).]


Levenshtein's theorem. In this section we prove Levenshtein's theorem,
which says that codes exist which meet the Plotkin bound.

First a couple of simple constructions.

(i) The codewords in $\mathcal{A}_n$ which begin with 0 form an
(n - 2, n/2, n/2) code $\mathcal{A}'_n$, if the initial zero is deleted
($\mathcal{A}'_{12}$ is shown in Fig. 2.7).

(ii) Suppose we have an $(n_1, M_1, d_1)$ code $\mathcal{C}_1$ and an
$(n_2, M_2, d_2)$ code $\mathcal{C}_2$. We paste a copies of $\mathcal{C}_1$ side by
side, followed by b copies of $\mathcal{C}_2$ (Fig. 2.6).


[Fig. 2.6. Forming a new code by pasting together copies of two codes.]


Now omit the last $M_2 - M_1$ rows of $\mathcal{C}_2$ (if $M_1 < M_2$), or omit the
last $M_1 - M_2$ rows of $\mathcal{C}_1$ (if $M_1 > M_2$). The result is an
(n, M, d) code with $n = an_1 + bn_2$, $M = \min\{M_1, M_2\}$, and
$d \ge ad_1 + bd_2$, for any values of a and b. We denote this code by
$a\mathcal{C}_1 \oplus b\mathcal{C}_2$.


Theorem 8. (Levenshtein.) Provided enough Hadamard matrices exist,
equality holds in the Plotkin bound (3)-(6). Thus if d is even,

    $A(n, d) = 2\left[\frac{d}{2d-n}\right]$  if $2d > n \ge d$,    (9)

    $A(2d, d) = 4d;$    (10)

and if d is odd,

    $A(n, d) = 2\left[\frac{d+1}{2d+1-n}\right]$  if $2d + 1 > n \ge d$,    (11)

    $A(2d + 1, d) = 4d + 4.$    (12)


Proof. (11), (12) follow from (9), (10) using Theorem 2, so we may assume that
d is even.
The Hadamard code $\mathcal{C}_{2d}$ above is a (2d, 4d, d) code, which establishes
(10).
To prove (9) we shall construct an (n, M, d) code with M = 2[d/(2d - n)]
for any n and even d satisfying $2d > n \ge d$. We need the following simple
result, whose proof is left to the reader. If $2d > n \ge d$, define
k = [d/(2d - n)], and

    a = d(2k + 1) - n(k + 1),    b = kn - d(2k - 1).    (13)


Then a and b are nonnegative integers and


    n = (2k - 1)a + (2k + 1)b,
    d = ka + (k + 1)b.


If n is even then so are a and b (from (13)). If n is odd and k even, then b is
even. If n and k are odd, then a is even.
Now consider the code $\mathcal{C}$, where:


    if n even,          $\mathcal{C} = \frac{a}{2}\mathcal{A}'_{4k} \oplus \frac{b}{2}\mathcal{A}'_{4k+4}$,
    if n odd, k even,   $\mathcal{C} = a\mathcal{A}_{2k} \oplus \frac{b}{2}\mathcal{A}'_{4k+4}$,
    if n and k odd,     $\mathcal{C} = \frac{a}{2}\mathcal{A}'_{4k} \oplus b\mathcal{A}_{2k+2}$.

Then in each case it is clear from remark (ii) above that $\mathcal{C}$ has length
n, minimum distance d, and contains 2k = 2[d/(2d - n)] codewords. The existence of
this code establishes (9). Q.E.D.

Note that the proof of (10) requires a Hadamard matrix of order 2d and
that Hadamard matrices of orders 2k (if k even), 2k + 2 (if k odd), 4k, 4k + 4
are sufficient for the proof of (9).
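
The bookkeeping in (13) is easily mechanized. The Python sketch below (the function name is ours) computes k, a and b for given n and even d, and checks the two identities stated in the proof.

```python
def levenshtein_parameters(n, d):
    """The numbers k, a, b of equation (13), for even d with 2d > n >= d."""
    if not (d % 2 == 0 and 2 * d > n >= d):
        raise ValueError("need d even and 2d > n >= d")
    k = d // (2 * d - n)
    a = d * (2 * k + 1) - n * (k + 1)
    b = k * n - d * (2 * k - 1)
    # The two identities used in the proof of (9).
    assert n == (2 * k - 1) * a + (2 * k + 1) * b
    assert d == k * a + (k + 1) * b
    return k, a, b

print(levenshtein_parameters(27, 16))  # (3, 4, 1)
```

The output for n = 27, d = 16 is the set of values used in the example that follows.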


Example. We illustrate Levenshtein's method by constructing a (27, 6, 16)
code. Referring to the proof of the theorem, we find k = 3, a = 4, b = 1. This
is the case n and k odd, so the code is obtained by pasting together two
copies of $\mathcal{A}'_{12}$ and one copy of $\mathcal{A}_8$.

$\mathcal{A}_8$ and $\mathcal{A}'_{12}$ are obtained from $H_8$ and $H_{12}$ in
Fig. 2.5, and are shown in Fig. 2.7.


    $\mathcal{A}_8$:           $\mathcal{A}'_{12}$:

    0000000
    1001011        0000000000
    1100101        1101000111
    1110010        1110110100
    0111001        0111011010
    1011100        0011101101
    0101110        1000111011
    0010111


Fig. 2.7. The (7, 8, 4) code $\mathcal{A}_8$ and the (10, 6, 6) code
$\mathcal{A}'_{12}$.


The resulting (27, 6, 16) code is shown in Fig. 2.8.


    0000000000 0000000000 0000000
    1101000111 1101000111 1001011
    1110110100 1110110100 1100101
    0111011010 0111011010 1110010
    0011101101 0011101101 0111001
    1000111011 1000111011 1011100


Fig. 2.8. A (27, 6, 16) code illustrating Levenshtein's construction.


Problem. (5) Construct (28,8, 16) and (32,8, 18) codes. 


Other applications of Hadamard matrices.
(1) Maximal determinants. If H is a Hadamard matrix of order n, taking
determinants of (7) we get


    $\det(HH^T) = \det H \cdot \det H^T = (\det H)^2 = n^n,$

so H has determinant $\pm n^{n/2}$. An important theorem of Hadamard [572] states
that if $A = (a_{ij})$ is any real n × n matrix with $-1 \le a_{ij} \le 1$, then


    $|\det A| \le n^{n/2}.$

Furthermore equality holds iff a Hadamard matrix of order n exists (and so n
must be a multiple of 4). (See also Bellman [100, p. 127], Hardy et al. [601, p.
34], Ryser [1136, p. 105].) If n is not a multiple of 4, less is known about the
largest possible determinant; see Problem 8.

(2) Weighing designs. By weighing several objects together instead of 
weighing them separately it is sometimes possible to determine the individual 
weights more accurately. Techniques for doing this are called weighing 
designs, and the best ones are based on Hadamard matrices. 

These techniques are applicable to a great variety of problems of 
measurement, not only of weights, but of lengths, voltages, resistances, 
concentrations of chemicals, frequency spectra (Decker [341], Gibbs and 
Gebbie [478], Golay [507, 508], Harwit et al. [622], Phillips et al. [1042], Sloane 
et al. [1234-1236]), in fact to any experiment where the measure of several 
objects is the sum of the individual measurements. For simplicity, however, 
we shall just describe the weighing problem. 

Suppose four light objects are to be weighed, using a balance with two 
pans which makes an error ε each time it is used. Assume that ε is a random 
variable with mean zero and variance σ², which is independent of the amount 
being weighed. 

First, suppose the objects are weighed separately. If the unknown weights 
are a, b, c, d, the measurements are y_1, y_2, y_3, y_4, and the (unknown) errors 
made by the balance are ε_1, ε_2, ε_3, ε_4, then the four weighings give four 
equations: 

        a = y_1 + ε_1,   b = y_2 + ε_2,   c = y_3 + ε_3,   d = y_4 + ε_4. 

The estimates of the unknown weights a, b, ... are 

        â = y_1 = a − ε_1,   b̂ = y_2 = b − ε_2,  ..., 

each with variance σ². 

On the other hand, suppose the four weighings are made as follows: 
        a + b + c + d = y_1 + ε_1, 
        a − b + c − d = y_2 + ε_2, 
        a + b − c − d = y_3 + ε_3,                (14) 
        a − b − c + d = y_4 + ε_4. 

This means that in the first weighing all four objects are placed in the left 
hand pan of the balance, and in the other weighings two objects are in the left 
pan and two in the right. Since the coefficient matrix on the left is a Hadamard 
matrix, it is easy to solve for a, b, c, d. Thus the estimate for a is 

        â = (y_1 + y_2 + y_3 + y_4)/4 

          = a − (ε_1 + ε_2 + ε_3 + ε_4)/4. 

The variance of cε, where c is a constant, is c² times the variance of ε, and 
the variance of a sum of independent random variables is the sum of the 
individual variances. Therefore the variance of â is 4·(σ²/16) = σ²/4, an 
improvement by a factor of 4. Similarly for the other unknowns. 

In general if n objects are to be weighed in n weighings, and a Hadamard 
matrix of order n exists, this technique reduces the variance from σ² to σ²/n. 
This is known to be the smallest that can be attained with any choice of signs 
on the left side of (14). For a proof of this, and for much more information 
about weighing designs, see Banerjee [65,66], Geramita et al. [473-477], 
Hotelling [665], Mood [972], Payne [1032] and Sloane and Harwit [1236]. 

Now suppose the balance has only one scale pan, so only coefficients 0 and 
1 can be used. In this case the variance of the estimates can also be reduced, 
though not by such a large factor. Again we illustrate the technique by an 
example. If seven objects a, b, c,...,g are to be weighed, use the following 
weighing design: 

        a     + c     + e     + g = y_1 + ε_1 
            b + c         + f + g = y_2 + ε_2 
        a + b         + e + f     = y_3 + ε_3 
                    d + e + f + g = y_4 + ε_4                (15) 
        a     + c + d     + f     = y_5 + ε_5 
            b + c + d + e         = y_6 + ε_6 
        a + b     + d         + g = y_7 + ε_7 


The coefficients are determined in an obvious way from the Hadamard matrix 
H_8 of Fig. 2.3. Then the estimate for a is (cf. Problem 7) 

        â = (y_1 − y_2 + y_3 − y_4 + y_5 − y_6 + y_7)/4 

          = a − (ε_1 − ε_2 + ε_3 − ε_4 + ε_5 − ε_6 + ε_7)/4, 

which has variance 7σ²/16, and similarly for the other weights. In general if 
there are n objects and a Hadamard matrix of order n + 1 exists, this 
technique reduces the variance to 4nσ²/(n + 1)², which is in some sense the 
best possible - see Mood [972], Raghavarao [1085, Ch. 17], and Sloane and 
Harwit [1236]. 
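
Both variance claims can be recomputed directly. If the weighings are written M w = y + ε, the estimates are ŵ = M^{-1} y = w − M^{-1} ε, so the variance of the i-th estimate is σ² times the sum of the squares of row i of M^{-1}. The sketch below does this exactly for the coefficient matrices of (14) and (15); the names inverse and variance_factors are ours, and the matrices are transcribed from the two weighing designs above.

```python
from fractions import Fraction

H4 = [[1, 1, 1, 1],
      [1, -1, 1, -1],
      [1, 1, -1, -1],
      [1, -1, -1, 1]]          # coefficient matrix of (14)

T7 = [[1, 0, 1, 0, 1, 0, 1],
      [0, 1, 1, 0, 0, 1, 1],
      [1, 1, 0, 0, 1, 1, 0],
      [0, 0, 0, 1, 1, 1, 1],
      [1, 0, 1, 1, 0, 1, 0],
      [0, 1, 1, 1, 1, 0, 0],
      [1, 1, 0, 1, 0, 0, 1]]   # coefficient matrix of (15), columns a..g


def inverse(M):
    """Exact inverse of a nonsingular matrix by Gauss-Jordan elimination."""
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [x / pivot for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    return [row[n:] for row in A]


def variance_factors(M):
    """Variance of each estimate, in units of sigma^2, for the design M w = y + eps."""
    return [sum(x * x for x in row) for row in inverse(M)]


print(variance_factors(H4))   # two-pan design (14)
print(variance_factors(T7))   # one-pan design (15)
```

The first list consists of 1/4 four times and the second of 7/16 seven times, in agreement with σ²/n for n = 4 and 4nσ²/(n + 1)² for n = 7.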

We can describe these techniques by saying that we have used a Hadamard 
code to encode, or transform, the data before measuring it. 

Another type of weighing problem, also related to coding theory, will be 
discussed in Ch. 6. 

(3) The Hadamard Transform. Let H be a Hadamard matrix of order n. If 
X = (x_1, ..., x_n) is a real vector, its Hadamard transform (or Walsh, or 
discrete Fourier transform) is the vector 

        X̂ = XH. 

Let F^m denote the set of all binary m-tuples. The entries (−1)^{u·v}, for 
u, v ∈ F^m, form a Hadamard matrix of order n = 2^m (Problem 3). If f is a 
mapping defined on F^m, its Hadamard transform f̂ is given by 

        f̂(u) = Σ_{v ∈ F^m} (−1)^{u·v} f(v),    u ∈ F^m. 

We shall use f̂ in an important lemma (Lemma 2) in Ch. 5, and in studying 
Reed-Muller codes. Hadamard transforms are also widely used in com- 
munications and physics, and have an extensive and widely scattered 
literature - see for example Ahmed et al. [12-15], Harmuth [603—604, 680], 
Kennett [757], Pratt et al. [1079], Shanks [1187] and Wadbrook and Wollons 
[1375]. Analysis of the effect of errors on the transform involves detailed 
knowledge of the cosets of first order Reed-Muller codes- see Berlekamp 
[119]. 
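
For a function on F^m the transform can be written out in a few lines. The sketch below is a direct (not fast) evaluation of the defining sum; the function f(v) = wt(v) is only a toy example of ours. Since the matrix with entries (−1)^{u·v} is symmetric and Hadamard, applying the transform twice should return 2^m f, which the last lines check.

```python
from itertools import product


def hadamard_transform(f, m):
    """Transform of f: F^m -> R, given as a dict keyed by binary m-tuples."""
    points = list(product((0, 1), repeat=m))

    def sign(u, v):
        # (-1)^(u.v), with u.v the scalar product taken mod 2
        return -1 if sum(a * b for a, b in zip(u, v)) % 2 else 1

    return {u: sum(sign(u, v) * f[v] for v in points) for u in points}


m = 3
f = {v: sum(v) for v in product((0, 1), repeat=m)}   # toy example: f(v) = wt(v)
f_hat = hadamard_transform(f, m)
f_back = hadamard_transform(f_hat, m)

print(f_hat[(0, 0, 0)])
print(all(f_back[v] == 2 ** m * f[v] for v in f))
```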


Problems. (6) Let 

        H_n = ( 1    1 ) 
              ( 1^T  S ) 

be a normalized Hadamard matrix of order n. Show that SS^T = nI − J, 
SJ = JS = −J, S^{-1} = (1/n)(S^T − J). 
(7) Let T = ½(J − S) (so that T is the coefficient matrix in (15)). Show that 
T^{-1} = −(2/n)S^T. 
(8) Maximum determinant problems. Suppose A = (a_ij) is a real matrix of 
order n. Let 

        f(n) = max |det (A)| subject to a_ij = 0 or 1, 
        g(n) = max |det (A)| subject to a_ij = −1 or 1, 
        h(n) = max |det (A)| subject to a_ij = −1, 0, or 1, 
        F(n) = max |det (A)| subject to 0 ≤ a_ij ≤ 1, 
        G(n) = max |det (A)| subject to −1 ≤ a_ij ≤ 1. 

Show 
        (a) f(n) = F(n), 
        (b) g(n) = h(n) = G(n), 
        (c) g(n) = 2^{n−1} f(n − 1). 

Thus all five problems are equivalent! Hadamard [572] proved that 
f(n) ≤ 2^{−n}(n + 1)^{(n+1)/2}, with equality iff a Hadamard matrix of order 
n + 1 exists. The first few values of f are as follows: 

        n       1   2   3   4   5   6    7    8     9    10     11     12     13 
        f(n)    1   1   2   3   5   9   32   56   144   320   1458   3645   9477 

See for example Brenner and Cummings [196], Cohn [298], Ehlich [403], 
Ehlich and Zeller [404] and Yang [1445]. 







§4. Conference matrices 


These are similar to Hadamard matrices but with a slightly different 
defining equation. They also give rise to good nonlinear codes. Our treatment 
will be brief. 


Definition. A conference matrix C of order n is an n x n matrix with diagonal 
entries 0 and other entries +1 or —1, which satisfies 


        CC^T = (n − 1)I.                (16) 

(The name arises from the use of such matrices in the design of networks 
having the same attenuation between every pair of terminals.) These matrices 
are sometimes called C-matrices. 


Properties. (1) As with Hadamard matrices we can normalize C by multiplying 
rows and columns by −1, so that C has the form 

        C = ( 0    1 ) 
            ( 1^T  S ),                (17) 

where S is a square matrix of order n − 1 satisfying 

        SS^T = (n − 1)I − J,   SJ = JS = 0.                (18) 


For example, normalized conference matrices of orders 2 and 4 are shown 
in Fig. 2.9 


Fig. 2.9. Conference matrices of orders 2 and 4. 


(2) If C exists then n must be even. If n ≡ 2 (mod 4) then C can be made 
symmetric by multiplying rows and columns by −1, while if n ≡ 0 (mod 4) 
then C can be made skew-symmetric. 

(3) Conversely if a symmetric C exists then n ≡ 2 (mod 4) and n − 1 can be 
written as 

        n − 1 = a² + b², 

where a and b are integers. If a skew-symmetric C exists then n = 2 or n ≡ 0 
(mod 4). (For the proof of these properties see Delsarte et al. [366] and 
Belevitch [97].) 

Several constructions are known. The most useful for our purpose is the 
following. 


Construction. (Paley.) Let n = p^m + 1 ≡ 2 (mod 4), where p is an odd prime. 
As in §3 we define the p^m × p^m Jacobsthal matrix Q = (q_ij) where q_ij = χ(j − i). 
Now p^m ≡ 1 (mod 4), so Q is symmetric by property (Q2). Then 

        C = ( 0    1 ) 
            ( 1^T  Q ) 

is a symmetric conference matrix of order n. 


Examples. (i) n = 6, p^m = 5. The quadratic residues mod 5 are 1 and 4, and the 
construction gives the first matrix shown in Fig. 2.10. 

(ii) n = 10, p^m = 3². Let the elements of the field GF(3²) defined by α² + α + 
2 = 0 be 0, 1, 2, α, α + 1, α + 2, 2α, 2α + 1, 2α + 2 (see Ch. 4). The quadratic 
residues in this field are 1, 2, α + 2, 2α + 1. Then the construction gives the 
second matrix shown in Fig. 2.10. 




















                     0   1   2   3   4 
                 0   1   1   1   1   1 
            0    1   0   1   −   −   1 
    C_6 =   1    1   1   0   1   −   − 
            2    1   −   1   0   1   − 
            3    1   −   −   1   0   1 
            4    1   1   −   −   1   0 


                          0    1    2    α  α+1  α+2   2α 2α+1 2α+2 
                     0    1    1    1    1    1    1    1    1    1 
            0        1    0    1    1    −    −    1    −    1    − 
            1        1    1    0    1    1    −    −    −    −    1 
            2        1    1    1    0    −    1    −    1    −    − 
    C_10 =  α        1    −    1    −    0    1    1    −    −    1 
            α+1      1    −    −    1    1    0    1    1    −    − 
            α+2      1    1    −    −    1    1    0    −    1    − 
            2α       1    −    −    1    −    1    −    0    1    1 
            2α+1     1    1    −    −    −    −    1    1    0    1 
            2α+2     1    −    1    −    1    −    −    1    1    0 


Fig. 2.10. Symmetric conference matrices of orders 6 and 10 (− denotes −1). 


The above construction gives symmetric conference matrices of orders 
6, 10, 14, 18, 26, 30, 38, 42,50,.... Orders 22, 34,58,... do not exist from 
Property 3. A recent construction of Mathon [1477] gives matrices of order 46, 
442,.... 
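
For prime p (rather than a general prime power p^m, which would need arithmetic in GF(p^m)) Paley's construction takes only a few lines. The following is a minimal sketch under that restriction; the names chi, paley_conference and is_conference are ours, and χ is evaluated by Euler's criterion.

```python
def chi(x, p):
    """Quadratic character on GF(p), p an odd prime (Euler's criterion)."""
    x %= p
    if x == 0:
        return 0
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1


def paley_conference(p):
    """Conference matrix of order p + 1, for a prime p = 1 (mod 4)."""
    if p % 4 != 1:
        raise ValueError("this sketch needs a prime p = 1 (mod 4)")
    Q = [[chi(j - i, p) for j in range(p)] for i in range(p)]   # Jacobsthal matrix
    return [[0] + [1] * p] + [[1] + row for row in Q]


def is_conference(C):
    """Check zero diagonal, +-1 elsewhere, and C C^T = (n - 1) I."""
    n = len(C)
    for i in range(n):
        for j in range(n):
            if (C[i][j] == 0) != (i == j) or abs(C[i][j]) > 1:
                return False
            dot = sum(a * b for a, b in zip(C[i], C[j]))
            if dot != (n - 1 if i == j else 0):
                return False
    return True


C6 = paley_conference(5)
for row in C6:
    print(row)
print(is_conference(C6), is_conference(paley_conference(13)))
```

The function does not test that p is prime; that is left to the caller in this sketch.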




Codes from conference matrices. Let C_n be a symmetric conference matrix of 
order n (so n ≡ 2 (mod 4)), and let S be as in Equation (17). Then the rows of 
½(S + I + J), ½(−S + I + J), plus the zero and all-ones vectors form an 

        (n − 1, 2n, ½(n − 2)) 

nonlinear conference matrix code. That the minimum distance is ½(n − 2) 
follows easily from Equation (18). 


Example. The conference matrix C_10 in Fig. 2.10 gives the (9, 20, 4) code 𝒞_9 
shown in Fig. 2.11. 


000 000 000 


111 001 010 
111 100 001 
111 010 100 


010 111 001 
001 111 100 
100 111 010 


001 010 111 
100 001 111 
010 100 111 


100 110 101 
010 011 110 
001 101 011 


101 100 110 
110 010 011 
011 001 101 


110 101 100 
011 110 010 
101 011 001 


111 111 111 


Fig. 2.11. A (9, 20, 4) conference matrix code 𝒞_9. 


(The zero codeword could be changed to a vector of weight 1 without 
decreasing the minimum distance. Therefore there are at least two 
inequivalent codes with parameters (9, 20, 4).) 
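
The parameters of the code of Fig. 2.11, and the remark about replacing the zero codeword, can be confirmed by brute force. In the sketch below the twenty codewords are transcribed from Fig. 2.11 (with spaces removed); dist and min_distance are helper names of ours.

```python
from itertools import combinations

FIG_2_11 = """
000000000
111001010 111100001 111010100
010111001 001111100 100111010
001010111 100001111 010100111
100110101 010011110 001101011
101100110 110010011 011001101
110101100 011110010 101011001
111111111
""".split()


def dist(u, v):
    return sum(a != b for a, b in zip(u, v))


def min_distance(code):
    return min(dist(u, v) for u, v in combinations(code, 2))


# Replace the zero word by each weight-1 vector in turn.
nonzero = [w for w in FIG_2_11 if w != "000000000"]
variants = [nonzero + ["0" * i + "1" + "0" * (8 - i)] for i in range(9)]

print(len(set(FIG_2_11)), min_distance(FIG_2_11))
print({min_distance(v) for v in variants})
```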

This important code was first found by Golay [509], using a different 
method. We shall give two other constructions for (9, 20, 4) codes in §7. In §8 
the code of Fig. 2.11 will be used to generate infinitely many nonlinear 
single-error-correcting codes. 

The first few conference matrix codes are the following: 

        (5, 12, 2), (9, 20, 4), (13, 28, 6), (17, 36, 8), 
        (25, 52, 12), (29, 60, 14), (37, 76, 18), (41, 84, 20). 


Research Problem 2.2. Since n is greater than 2d the Plotkin bound does not 
apply to these codes. However we will see in Ch. 17 that the (9, 20, 4) code is 
optimal in the sense of having the largest number of codewords for this length 
and distance. (Thus A(9, 4) = 20.) We conjecture that all of these codes except 
the first and third are similarly optimal. (There exists a (5, 16, 2) linear code 
and a (13, 32, 6) nonlinear code - see §8.) 


§5. t-designs 


Definition. Let X be a v-set (i.e. a set with v elements), whose elements are 
called points or sometimes (for historical reasons) varieties. A t-design is a 
collection of distinct k-subsets (called blocks) of X with the property that any 
t-subset of X is contained in exactly λ blocks. In more picturesque language, 
it is a collection of committees chosen out of v people, each committee 
containing k persons, and such that any t persons serve together on exactly λ 
committees. We call this a t-(v, k, λ) design*. t-designs are also sometimes 
called tactical configurations. 


Example. The seven points and seven lines (one of which is curved) of Fig. 
2.12 form the projective plane of order 2. If we take the lines as blocks, this is 
a 2-(7, 3, 1) design, since there is a unique line through any two of the seven 
points. The seven blocks are 


013, 124, 235, 346, 450, 561, 602. 
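
That these seven blocks form a 2-(7, 3, 1) design can be checked by counting, for each pair of points, the blocks that contain it. A minimal sketch (the helper name coverage is ours):

```python
from collections import Counter
from itertools import combinations

BLOCKS = ["013", "124", "235", "346", "450", "561", "602"]


def coverage(blocks, t):
    """Count, for every t-subset of points that occurs, the blocks containing it."""
    counts = Counter()
    for block in blocks:
        for subset in combinations(sorted(block), t):
            counts[subset] += 1
    return counts


pairs = coverage(BLOCKS, 2)
# 21 = C(7, 2) distinct pairs, each seen exactly once, means a 2-(7, 3, 1) design.
print(len(pairs), set(pairs.values()))
```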


A 2-design is called a balanced incomplete block design, and the ter- 
minology of the subject comes from the original application of such designs in 
agricultural or biological experiments. For example suppose v varieties of 
fertilizer are to be compared in their effect on b different crops. Ideally each 
crop would be tested with each variety of fertilizer, giving b blocks of land 
(one for each crop) each of size v. This is a 2-(v, v, b) design known as a 
complete design. However for reasons of economy this is impossible, and we 
seek a design where each crop is tested with only k of the varieties of 
fertilizer (so each block now has size k), and where any two fertilizers are 
used together on the same crop a constant number λ of times. Thus the design 
is balanced so far as comparisons between pairs of fertilizers are concerned. 


*Some authors put the letters in a different order. 


Fig. 2.12. The projective plane of order 2. 


This is a 2-(v, k, λ) design with b blocks, and is incomplete if k < v. 
Some other t-designs also have names of their own. 


Definition. A Steiner system is a t-design with λ = 1, and a t-(v, k, 1) design is 
usually called an S(t, k, v). Thus the example of Fig. 2.12 is an S(2, 3, 7). 


Definition. A projective plane of order n is an S(2, n + 1, n² + n + 1) with n ≥ 2. 
That is why Fig. 2.12 is a projective plane of order 2. (For details see 
Appendix B.) 


Definition. An affine plane of order n is an S(2, n, n²) with n ≥ 2. 


Theorem 9. In a t-(v, k, λ) design, let P_1, ..., P_t be any t distinct points, let λ_i 
be the number of blocks containing P_1, ..., P_i, for 1 ≤ i ≤ t, and let λ_0 = b be 
the total number of blocks. Then λ_i is independent of the choice of P_1, ..., P_i, 
and in fact 

        λ_i = λ \binom{v−i}{t−i} / \binom{k−i}{t−i} 

            = λ (v−i)(v−i−1)···(v−t+1) / [(k−i)(k−i−1)···(k−t+1)]    for 0 ≤ i ≤ t.    (19) 

This implies that a t-(v, k, λ) design is also an i-(v, k, λ_i) design for 1 ≤ i ≤ t. 

Proof. The result is true for i = t by definition of a t-design, since any t points 
are contained in exactly λ blocks. We proceed by induction on i. Suppose we 
have already shown that λ_{i+1} is independent of the choice of P_1, ..., P_{i+1}. For 
each block B containing P_1, ..., P_i and for each point Q distinct from 
P_1, ..., P_i define χ(Q, B) = 1 if Q ∈ B, = 0 if Q ∉ B. Then from the induction 
hypothesis 

        Σ_Q Σ_B χ(Q, B) = λ_{i+1}(v − i) 

                        = Σ_B Σ_Q χ(Q, B) = λ_i(k − i),                (20) 

which proves that λ_i is independent of the choice of P_1, ..., P_i, and 
establishes (19). Q.E.D. 


Corollary 10. In a t-(v, k, λ) design the total number of blocks is 

        b = λ \binom{v}{t} / \binom{k}{t},                (21) 

and each point belongs to exactly r blocks, where 

        bk = vr.                (22) 

Furthermore in a 2-design, 

        λ(v − 1) = r(k − 1).                (23) 


Proof. b = λ_0 is given by (19). Since r = λ_1, (22) and (23) follow from (20). 
Q.E.D. 


If b = v, and hence r = k, the design is called symmetric. 


Corollary 11. A necessary condition for a t-(v, k, λ) design to exist is that the 
numbers λ \binom{v−i}{t−i} / \binom{k−i}{t−i} be integers for 0 ≤ i ≤ t. 


In some cases, e.g., for Steiner systems S(2, 3, v), S(2, 4, v), S(2, 5, v) and 
S(3,4, v), this condition is also sufficient, but not always. For example we 
shall see in Ch. 19 that an S(2, 7, 43) or projective plane of order 6 does not 
exist, even though the above condition is satisfied. 
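
The integrality test of Corollary 11 is easy to automate. The sketch below evaluates the numbers λ_i of (19) as exact fractions and checks whether they are all integers; the function names lambdas and passes_corollary_11 are ours. It confirms that the parameters of an S(2, 7, 43) pass the test, which says nothing about existence.

```python
from fractions import Fraction
from math import comb


def lambdas(t, v, k, lam=1):
    """The numbers lambda_i of (19) for i = 0..t, as exact fractions."""
    return [Fraction(lam * comb(v - i, t - i), comb(k - i, t - i))
            for i in range(t + 1)]


def passes_corollary_11(t, v, k, lam=1):
    """True if every lambda_i is an integer (a necessary condition only)."""
    return all(x.denominator == 1 for x in lambdas(t, v, k, lam))


print(lambdas(2, 43, 7))               # S(2, 7, 43): b = 43, r = 7, lambda = 1
print(passes_corollary_11(2, 43, 7))   # condition holds, yet no such plane exists
print(passes_corollary_11(2, 8, 3))    # an S(2, 3, 8) is ruled out
```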


Research Problem 2.3. It has been conjectured that for any given t, k and λ, if 
the conditions of Corollary 11 are satisfied and v is sufficiently large, then a 
t-(v, k, λ) design exists. But so far this has only been proved for t = 2, see 
Wilson [1420-1422], and in fact no designs are presently known with t > 5, 
except trivial ones. 


* Definition. In a t-(v, k, λ) design, let P_1, ..., P_k be the points belonging to one 
of the blocks. Consider the blocks which contain P_1, ..., P_j but do not 
contain P_{j+1}, ..., P_i, for 0 ≤ j ≤ i. (For j = 0 we consider the blocks which do 
not contain P_1, ..., P_i, and for j = i we consider the blocks which contain 
P_1, ..., P_i.) If the number of such blocks is a constant, independent of the 
choice of P_1, ..., P_i, we denote it by λ_ij. The λ_ij are called block intersection 
numbers. 


Theorem 12. The λ_ij are well defined for i ≤ t. In fact λ_ii = λ_i for i ≤ t, with 
λ_t = λ, and are given by Theorem 9. Also the λ_ij satisfy the Pascal property 

        λ_ij = λ_{i+1,j} + λ_{i+1,j+1} 

whenever they are defined. Finally, if the design is a Steiner system so that 
λ = 1, then λ_tt = λ_{t+1,t+1} = ··· = λ_kk = 1, and the λ_ij are therefore defined for all 
0 ≤ j ≤ i ≤ k. 


Proof. The Pascal property holds because λ_ij is equal to the number of blocks 
which contain P_1, ..., P_j but do not contain P_{j+1}, ..., P_i. These blocks can 
be divided into those which contain P_{i+1}, i.e., λ_{i+1,j+1}, and those which do not, 
i.e., λ_{i+1,j}. That the λ_ij are defined for i ≤ t is an immediate consequence of 
Theorem 9. Indeed, both the total number of blocks, λ_00 = λ_0, and the number 
of blocks through P_1, λ_11 = λ_1, are independent of the choice of P_1, hence the 
number of blocks not containing P_1 is λ_10 = λ_0 − λ_1, and is also constant, and 
so on. The last sentence of the theorem is obvious. Q.E.D. 


Thus for any t-(v, k, λ) design we may form the "Pascal triangle" of its 
block intersection numbers: 

                        λ_00 = λ_0 
                   λ_10        λ_11 = λ_1 
              λ_20        λ_21        λ_22 = λ_2 


E.g. for the S(2, 3, 7) of Fig. 2.12, we obtain the triangle 

                        7 
                      4   3 
                    2   2   1 
                  0   2   0   1 

*The remainder of this section can be omitted on the first reading. 

From a given t-(v, k, λ) design with block intersection numbers λ_ij we can 
obtain several other designs. Let ℬ be the set of b blocks of the original 
design. 

Suppose we omit one of the points, say P, from all the blocks of ℬ, and 
consider the two sets of blocks which remain: the first, ℬ_0 say, consists of the 
λ_10 blocks of k points that did not contain P in the first place, and the second, 
ℬ_1 say, consists of λ_11 blocks of k − 1 points. 


Theorem 13. The blocks ℬ_0 form a (t − 1)-(v − 1, k, λ_{t,t−1}) design with block 
intersection numbers λ^{(0)}_ij = λ_{i+1,j}. The blocks ℬ_1 form a (t − 1)-(v − 1, k − 1, λ) 
design with block intersection numbers λ^{(1)}_ij = λ_{i+1,j+1}. These are called derived 
designs. 


The proof is left to the reader. 

Corollary 14. If a Steiner system S(t, k, v) exists so does an S(t − 1, k − 1, v − 1). 


If v − k ≥ t, taking the complements of all the blocks in ℬ gives the 
complementary design, which is a t-(v, v − k, λ_{t0}) design with block 
intersection numbers λ^{(c)}_ij = λ_{i,i−j}. 


Incidence Matrix. Given a t-(v, k, λ) design with v points P_1, ..., P_v and b 
blocks B_1, ..., B_b, its b × v incidence matrix A = (a_ij) is defined by 

        a_ij = 1 if P_j ∈ B_i, 
               0 if P_j ∉ B_i. 

For example the incidence matrix of the design of Fig. 2.12 is 

            1 1 0 1 0 0 0 
            0 1 1 0 1 0 0 
            0 0 1 1 0 1 0 
        A = 0 0 0 1 1 0 1                (24) 
            1 0 0 0 1 1 0 
            0 1 0 0 0 1 1 
            1 0 1 0 0 0 1 

(which is a matrix the reader should recognize from §3). 


Problem. (9) If t ≥ 2, show that 

        (1) A^T A = (r − λ_2)I + λ_2 J, 
        (2) det (A^T A) = (r − λ_2)^{v−1} (vλ_2 − λ_2 + r), 
        (3) if b > 1, then b ≥ v (Fisher's inequality). 

Codes and Designs. To every block in a Steiner system S(t, k, v) corresponds 
a row of the incidence matrix A. If we think of these rows as codewords, the 
Steiner system forms a nonlinear code with parameters 

        (n = v,   M = b = \binom{v}{t} / \binom{k}{t},   d ≥ 2(k − t + 1)). 

For two blocks cannot have more than t − 1 points in common (or else there 
would be t points contained in two blocks, a contradiction), and therefore the 
Hamming distance between two blocks is at least 2(k − t + 1). Note that every 
codeword has weight k: this is a constant weight code. 

One can also consider the linear code generated by the blocks, but little 
seems to be known in general about such codes. (There are a few results for 
t = 2, concerning the codes generated by affine and projective planes, as we 
shall see in Ch. 13.) 

Conversely, given an (n, M,d) code, one can sometimes obtain a design 
from the codewords of a fixed weight. For example: 


Theorem 15. Let ℋ_m be an [n = 2^m − 1, k = 2^m − 1 − m, d = 3] Hamming code 
(Ch. 1, §7), and let ℋ̂_m be obtained by adding an overall parity check to ℋ_m. 
Then the codewords of weight 3 in ℋ_m form a Steiner system S(2, 3, 2^m − 1), 
and the codewords of weight 4 in ℋ̂_m form an S(3, 4, 2^m). 


Definition. A vector v covers a vector u if the ones in u are a subset of the 
ones in v; for example 1001 covers 1001, 1000, 0001, and 0000. Equivalent 
statements are 

        u * v = u, 

        wt (u + v) = wt (v) − wt (u). 


Proof of Theorem 15. The first statement follows from the second using 
Corollary 14. To prove the second statement, let the coordinates of ℋ̂_m be 
labelled P_0, P_1, ..., P_n, where P_n is the overall parity check. Let u be an 
arbitrary vector of weight 3, with 1's at coordinates P_h, P_i, P_j say, with 
h < i < j. We must show that there is a unique codeword of weight 4 in ℋ̂_m 
which covers u. Certainly there cannot be two such codewords in ℋ̂_m, or their 
distance apart would be 2, a contradiction. Case (i), j < n. Since ℋ_m is a 
perfect single-error-correcting code, it contains a codeword c at distance at 
most 1 from u. Either c = u, in which case the extended codeword ĉ = |c|1| 
has weight 4, is in ℋ̂_m, and covers u, or else c has weight 4 and ĉ = |c|0| 
works. Case (ii), j = n. The vector u' with 1's in coordinates P_h, P_i is covered 
by a unique codeword c ∈ ℋ_m with weight 3, and ĉ = |c|1| covers u. Q.E.D. 

Corollary 16. The number of codewords of weight 3 in ℋ_m is 

        (2^m − 1)(2^{m−1} − 1)/3, 

and the number of weight 4 in ℋ̂_m is 

        2^{m−2} (2^m − 1)(2^{m−1} − 1)/3. 

The codewords of weight 3 in ℋ_m can be identified with the lines of the 
projective geometry PG(m − 1, 2), as we shall see in Ch. 13. 
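
Theorem 15 and Corollary 16 can be tested for small m. The sketch below assumes the usual description of the Hamming code in which coordinate a (1 ≤ a ≤ 2^m − 1) corresponds to the column of the parity check matrix that is the binary representation of a, so that a set of coordinates supports a codeword iff their labels have exclusive-or 0; the label 0 is used for the overall parity check. All function names are ours.

```python
from itertools import combinations


def weight3_supports(m):
    """Supports of weight-3 Hamming codewords, coordinates labelled 1..2^m - 1."""
    n = 2 ** m - 1
    return [(a, b, a ^ b) for a, b in combinations(range(1, n + 1), 2) if a ^ b > b]


def weight4_supports(m):
    """Supports of weight-4 words of the extended code; label 0 is the parity check."""
    return [(a, b, c, a ^ b ^ c)
            for a, b, c in combinations(range(2 ** m), 3) if a ^ b ^ c > c]


def is_steiner(blocks, points, t):
    """True if every t-subset of points lies in exactly one block."""
    seen = set()
    for block in blocks:
        for s in combinations(sorted(block), t):
            if s in seen:
                return False
            seen.add(s)
    return len(seen) == len(list(combinations(points, t)))


for m in (3, 4):
    w3, w4 = weight3_supports(m), weight4_supports(m)
    print(m, len(w3), is_steiner(w3, range(1, 2 ** m), 2),
          len(w4), is_steiner(w4, range(2 ** m), 3))
```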





Problem. (10) Find the codewords of weight 3 in the Hamming code of length 
7, and verify Corollary 16 for this case. Identify those codewords with the 
lines of Fig. 2.12. 


Exactly the same proof as that of Theorem 15 establishes: 


Theorem 17. Let 𝒞 be a perfect e-error-correcting code of length n, with e odd, 
and let 𝒞̂ be obtained by adding an overall parity check to 𝒞. Then the 
codewords of weight 2e + 1 in 𝒞 form a Steiner system S(e + 1, 2e + 1, n), and 
the codewords of weight 2e + 2 in 𝒞̂ form an S(e + 2, 2e + 2, n + 1). 


Generalizations of this Corollary and much more about codes and designs 
will be given in Ch. 6. 


Problem. (11) If H_n is a Hadamard matrix of order n ≥ 8, let S be defined as in 
Problem 6. Show that if −1's are replaced by 0's the rows of S form a 
symmetric 

        2-(n − 1, n/2 − 1, n/4 − 1) 

design, and conversely. 


§6. An introduction to the binary Golay code 


The Golay code is probably the most important of all codes, for both 
practical and theoretical reasons. In this section we give an elementary 
definition of the code, and establish a number of its properties. Further 
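properties will be given in Chs. 16 and 20. 

Definition. The extended Golay code 𝒢_24 has the generator matrix shown in 
Fig. 2.13. 


                         l                            r 
              ∞ 0 1 2 3 4 5 6 7 8 9 10     ∞ 0 1 2 3 4 5 6 7 8 9 10    row 

              1 1 . . . . . . . . . .      . 1 1 . 1 1 1 . . . 1 .      0 
              1 . 1 . . . . . . . . .      . . 1 1 . 1 1 1 . . . 1      1 
              1 . . 1 . . . . . . . .      . 1 . 1 1 . 1 1 1 . . .      2 
              1 . . . 1 . . . . . . .      . . 1 . 1 1 . 1 1 1 . .      3 
              1 . . . . 1 . . . . . .      . . . 1 . 1 1 . 1 1 1 .      4 
        G =   1 . . . . . 1 . . . . .      . . . . 1 . 1 1 . 1 1 1      5 
              1 . . . . . . 1 . . . .      . 1 . . . 1 . 1 1 . 1 1      6 
              1 . . . . . . . 1 . . .      . 1 1 . . . 1 . 1 1 . 1      7 
              1 . . . . . . . . 1 . .      . 1 1 1 . . . 1 . 1 1 .      8 
              1 . . . . . . . . . 1 .      . . 1 1 1 . . . 1 . 1 1      9 
              1 . . . . . . . . . . 1      . 1 . 1 1 1 . . . 1 . 1     10 
              . . . . . . . . . . . .      1 1 1 1 1 1 1 1 1 1 1 1     11 


Fig. 2.13. Generator matrix for extended Golay code 𝒢_24. The columns are labelled 
l_∞ l_0 l_1 ··· l_10 r_∞ r_0 r_1 ··· r_10. The 11 × 11 matrix on the right is A_11. 
(Zeros are shown as dots.) 


The 11 × 11 binary matrix A_11 on the right of the generator matrix is obtained 
from a Hadamard matrix of Paley type (cf. Figs. 2.1 and 2.5). This implies 
that the sum of any two rows of A_11 has weight 6. Hence the sum of any two 
rows of G has weight 8. 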


Lemma 18. 𝒢_24 is self dual: 𝒢_24 = 𝒢_24^⊥. 


Proof. If u and v are (not necessarily distinct) rows of G, then wt (u * v) ≡ 0 
(mod 2). Therefore every row of G is orthogonal to all the rows, and so 
𝒢_24 ⊂ 𝒢_24^⊥. But G has rank 12, so 𝒢_24 has dimension 12, and therefore 
𝒢_24 = 𝒢_24^⊥. Q.E.D. 


Lemma 19. (i) Every codeword of 𝒢_24 has weight divisible by 4. 
(ii) 𝒢_24 contains the codeword 1. 

Proof. (i) Problem 38 of Ch. 1. (ii) Add the rows of G. Q.E.D. 


Lemma 20. 𝒢_24 is invariant under the permutation of coordinates 

        T = (l_∞ r_∞)(l_0 r_0)(l_1 r_10)(l_2 r_9) ··· (l_10 r_1), 

which interchanges the two halves of a codeword. To be quite explicit: 
if 𝒢_24 contains the codeword |L|R| with L = a_∞ a_0 a_1 a_2 ... a_10, R = 
b_∞ b_0 b_1 b_2 ... b_10, it also contains the codeword |L'|R'| where 

        L' = b_∞ b_0 b_10 b_9 ... b_1,    R' = a_∞ a_0 a_10 a_9 ... a_1. 


Proof. T sends row 0 of G into 

        0, 1, 0100011101, 1, 1, 0000000000 

which is easily verified to be the sum of rows 0, 2, 6, 7, 8, 10, 11, and therefore 
is in the code. Similarly for rows 1 through 10. T sends the last row of G into 
1^12 0^12, which is the complement of the last row, and is in the code by Lemma 
19. Q.E.D. 


Remark. This lemma implies that whenever 𝒢_24 contains a codeword |L|R| 
with wt (L) = i, wt (R) = j, then it also contains a codeword |L'|R'| with 
wt (L') = j, wt (R') = i. 


The possible weights of codewords in 𝒢_24 are, by Lemma 19, 

        0, 4, 8, 12, 16, 20, 24. 

If u has weight 20, then u + 1 has weight 4. We show that there are no 
codewords of weight 4, hence none of weight 20. 


Lemma 21. 𝒢_24 contains no codewords of weight 4. 


Proof. For any codeword |L|R| of 𝒢_24, wt (L) ≡ wt (R) ≡ 0 (mod 2). By 
Lemma 20 we may suppose that a codeword of weight 4 is of one of the types 

        (1) wt (L) = 0, wt (R) = 4;    (2) wt (L) = 2, wt (R) = 2. 

(1) is impossible, since if wt (L) = 0, wt (R) = 0 or 12. (2) is impossible, since if 
wt (L) = 2, L is the sum of one or two rows of G, plus possibly the last row. In 
each case wt (R) = 6 by the paragraph preceding Lemma 18. 

Thus the weights occurring in 𝒢_24 are 0, 8, 12, 16, 24. Let A_i be the number 
of words of weight i. Then A_0 = A_24 = 1, A_8 = A_16. To each left side L there 
are two possible right sides, R and R̄. If wt (L) = 0, then wt (R) ≠ 4 (by 
Lemma 21) and wt (R) ≠ 8 (or else wt (R̄) = 4, again violating Lemma 21), so 
wt (R) = 0 or 12. If wt (L) = 2, then wt (R) = 6 by a similar argument. 
Proceeding in this way we arrive at the following possibilities for codewords in 
𝒢_24: 

    Number                             wt (L)   wt (R)   wt (R̄)    total weight 

    1                                     0        0       12         0    12 
    11 + \binom{11}{2}                    2        6        6         8     8 
    \binom{11}{3} + \binom{11}{4}         4        4        8         8    12 
    α = ?                                 6        2       10         8    16 
    β = ?                                 6        6        6        12    12 
    \binom{11}{7} + \binom{11}{8}         8        4        8        12    16 
    \binom{11}{9} + \binom{11}{10}       10        6        6        16    16 
    1                                    12        0       12        12    24 

Here "Number" is the number of left sides L of the given weight. But by 
Lemma 20, α is equal to the number of vectors of type (2, 6), which is 

        2 (11 + \binom{11}{2}) = 132. 

Therefore 

        A_8 = 2 (11 + \binom{11}{2}) + \binom{11}{3} + \binom{11}{4} + α = 132 + 495 + 132 = 759, 

and so A_12 = 2576. Thus we have shown that the weight distribution of 𝒢_24 is: 

        i:      0     8     12    16    24 
        A_i:    1   759   2576   759     1 
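
The weight distribution can also be obtained by enumerating all 2^12 codewords. The sketch below assumes the reading of Fig. 2.13 in which row i (0 ≤ i ≤ 10) has ones at l_∞, at l_i, and at those r_j with j − i equal to 0 or a quadratic residue mod 11, and the last row is zero on the left and all ones on the right. The function names are ours.

```python
from collections import Counter

RESIDUES = {0, 1, 3, 4, 5, 9}   # 0 and the quadratic residues mod 11


def golay_rows():
    """Rows of G as 24-bit integers; bit order l_inf, l_0..l_10, r_inf, r_0..r_10."""
    rows = []
    for i in range(11):
        bits = [1] + [int(j == i) for j in range(11)]                     # left half
        bits += [0] + [int((j - i) % 11 in RESIDUES) for j in range(11)]  # right half
        rows.append(int("".join(map(str, bits)), 2))
    rows.append(int("0" * 12 + "1" * 12, 2))                              # last row
    return rows


def weight_distribution(rows):
    """Weights of all 2^k sums of the rows (k = number of rows)."""
    counts = Counter()
    for mask in range(1 << len(rows)):
        word = 0
        for i, row in enumerate(rows):
            if mask >> i & 1:
                word ^= row
        counts[bin(word).count("1")] += 1
    return dict(counts)


print(sorted(weight_distribution(golay_rows()).items()))
```

Only five weights appear, with the multiplicities 1, 759, 2576, 759, 1 derived above.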


Theorem 22. Any binary vector of weight 5 and length 24 is covered by 
exactly one codeword of 𝒢_24 of weight 8. 


Proof. If a vector of weight 5 were covered by two codewords u, v of weight 8, 
then dist (u, v) ≤ 6, a contradiction. Each codeword of weight 8 covers \binom{8}{5} vectors 
of weight 5, which are all distinct, and 

        759 \binom{8}{5} = \binom{24}{5}.                Q.E.D. 


Corollary 23. The codewords of weight 8 in the extended Golay code 𝒢_24 form a 
Steiner system S(5, 8, 24). Hence by Corollary 14 we get Steiner systems 
S(4, 7, 23), S(3, 6, 22), and S(2, 5, 21). 


We shall refer to codewords in 𝒢_24 of weights 8 and 12 as octads and 
dodecads respectively. 


Theorem 24. The block intersection numbers λ_ij for the Steiner system formed 
by the octads are shown in Fig. 2.14. 


    i = 0    759 
        1    506   253 
        2    330   176    77 
        3    210   120    56    21 
        4    130    80    40    16     5 
        5     78    52    28    12     4     1 
        6     46    32    20     8     4     0     1 
        7     30    16    16     4     4     0     0     1 
        8     30     0    16     0     4     0     0     0     1 

Fig. 2.14. Block intersection numbers λ_ij for the octads in 𝒢_24. 


    i = 0   2576 
        1   1288  1288 
        2    616   672   616 
        3    280   336   336   280 
        4    120   160   176   160   120 
        5     48    72    88    88    72    48 
        6     16    32    40    48    40    32    16 
        7      0    16    16    24    24    16    16     0 
        8      0     0    16     0    24     0    16     0     0 

Fig. 2.15. Generalized block intersection numbers λ_ij for the dodecads in 𝒢_24. 


Corollary 25. The codewords of weight 16 in 𝒢_24 form a 5-(24, 16, 78) design, 
and the corresponding Pascal triangle is obtained by reflecting Fig. 2.14 about 
the middle and omitting the last three rows. 


Proof. This is the complementary design to that formed by the octads. 
Q.E.D. 


Theorem 26. The dodecads in 𝒢_24 form a 5-(24, 12, 48) design. Suppose 
P_1, ..., P_8 form an octad. Temporarily misusing the notation, let λ_ij be the 
number of dodecads containing P_1, ..., P_j and not containing P_{j+1}, ..., P_i, for 
0 ≤ j ≤ i ≤ 8. Then the λ_ij are as shown in Fig. 2.15. 


Proof. We first show that λ_82 = λ_86 = 16. In fact there is an obvious 
1-1 correspondence between dodecads containing P_1, ..., P_6 (and which 
cannot contain P_7 or P_8), and octads containing P_7 and P_8 but not containing 
P_1, ..., P_6. Therefore λ_86 = 16 from Fig. 2.14. From this it follows that 
3 × 16 = 48 dodecads contain P_1, ..., P_5, thus λ_55 = 48. Therefore the 
dodecads form a 5-design. The rest of the table can now be filled in from Theorem 
9 and the Pascal property. Q.E.D. 


In Chapter 6 we shall give a sufficient condition for codewords of a fixed 
weight in any code to form a t-design, which includes these results as a 
special case. 


Problems. (12) Show that the Golay code also gives rise to the following 
designs: 

         v     k      b    λ_1   λ_2   λ_3   λ_4 

        23     7    253     77    21     5     1 
        23     8    506    176    56    16     4 
        22     6     77     21     5     1 
        22     7    176     56    16     4 
        22     8    330    120    40    12 


(13) Show that the 4096 cosets of 𝒢_24 have the following weight distributions: 

    Number  Weight:  0     2     4     6     8    10    12    14    16    18    20    22    24 
         1           1                     759        2576         759                       1 
       276                 1          77   352   946  1344   946   352    77           1 
      1771                       6    64   360   960  1316   960   360    64     6 

    Number  Weight:  1     3     5     7     9    11    13    15    17    19    21    23 
        24           1                 253   506  1288  1288   506   253                 1 
      2024                 1    21   168   640  1218  1218   640   168    21     1 


Definition. The (unextended) Golay code of length 23, 𝒢_23, is obtained by 
deleting the last coordinate from every codeword of 𝒢_24. We shall see in Ch. 
16 that deleting any coordinate from 𝒢_24 gives an equivalent code. 


Theorem 27. 𝒢_23 is a [23, 12, 7] code, with weight distribution 

        i:      0     7     8     11     12    15    16    23 
        A_i:    1   253   506   1288   1288   506   253     1 


Proof. This follows immediately from Fig. 2.14. For example the codewords 
of weight 7 in 𝒢_23 are the octads in 𝒢_24 with last coordinate equal to 1: the 
number of these is λ_11 = 253. Q.E.D. 


Theorem 28. The Golay code 𝒢_23 is a perfect triple-error-correcting code. 

Proof. Since the code has minimum distance 7, the cosets of minimum weight 
≤ 3 are disjoint (by Theorem 2 of Ch. 1). The number of such cosets is 

        \binom{23}{0} + \binom{23}{1} + \binom{23}{2} + \binom{23}{3} = 2^11 = 2^23/2^12, 

so this includes all cosets of 𝒢_23. Therefore 𝒢_23 is perfect. Q.E.D. 
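
The counting identity used in this proof is the sphere-packing bound holding with equality. A short check, with helper names of our own choosing:

```python
from math import comb


def sphere_size(n, e):
    """Number of binary vectors of length n and weight at most e."""
    return sum(comb(n, i) for i in range(e + 1))


def is_perfect_parameters(n, k, e):
    """Sphere-packing equality 2^k * |sphere| = 2^n for a binary [n, k] code."""
    return 2 ** k * sphere_size(n, e) == 2 ** n


print(sphere_size(23, 3), 2 ** 11)
print(is_perfect_parameters(23, 12, 3), is_perfect_parameters(7, 4, 1))
```

Equality of the parameters is only a necessary condition; that a code with these parameters exists is what Theorems 27 and 28 establish for [23, 12, 7].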


§7. The Steiner system S(5, 6, 12), and nonlinear single-error-correcting codes 


Lemma 29. The nonlinear code consisting of the rows of A_11 (see Fig. 2.13), the 
sums of pairs of rows of A_11, and the complements of all these, is an 
(11, 132, 3) code. 


Proof. There are 

        2·11 + 2 \binom{11}{2} = 132 

codewords. A typical codeword has the form a + b or a + b + 1, where a, b 
are distinct rows of A_11 (and b may be zero). To show that the distance between 
codewords is at least 3, we must check 4 cases. (i) dist (a + b, a + c) = 
wt (b + c) = 6, by the paragraph preceding Lemma 18. (ii) Similarly dist (a + 
b, a + c + 1) = 5. (iii) dist (a + b, c + d) = wt (a + b + c + d) ≥ 4. For if it were 
less than 4, the corresponding codeword of 𝒢_24 would have weight less than 8, a 
contradiction. (iv) Similarly dist (a + b, c + d + 1) ≥ 3. Q.E.D. 


Theorem 30. The codewords of the extended (12, 132, 4) code form the blocks 
of a Steiner system S(5, 6, 12). 


Proof. Any vector of weight 5 can be covered by at most one block, or else there 
would be two codewords at distance less than 4 apart, which is impossible. 
Therefore the 132 blocks cover 132·6 = \binom{12}{5} distinct vectors of weight 5. Thus 
every vector of weight 5 is covered by exactly one block. Q.E.D. 
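
Lemma 29 and Theorem 30 can be verified by direct construction. The sketch below assumes that A_11 is the circulant matrix whose first row has ones in positions 0, 1, 3, 4, 5, 9 (0 and the quadratic residues mod 11), which is our reading of Fig. 2.13; all names are ours.

```python
from itertools import combinations

RESIDUES = {0, 1, 3, 4, 5, 9}
A11 = [tuple(int((j - i) % 11 in RESIDUES) for j in range(11)) for i in range(11)]


def add(u, v):
    """Sum of two binary vectors."""
    return tuple(a ^ b for a, b in zip(u, v))


def complement(u):
    return tuple(1 - a for a in u)


half = set(A11) | {add(a, b) for a, b in combinations(A11, 2)}
code11 = half | {complement(w) for w in half}                  # code of Lemma 29
code12 = {w + (sum(w) % 2,) for w in code11}                   # add overall parity check
blocks = [frozenset(i for i, x in enumerate(w) if x) for w in code12]

min_dist11 = min(sum(a != b for a, b in zip(u, v))
                 for u, v in combinations(code11, 2))
covered = [s for b in blocks for s in combinations(sorted(b), 5)]

print(len(code11), min_dist11, len(covered), len(set(covered)))
```

The last two numbers agree (both equal \binom{12}{5} = 792), so no 5-set is covered twice and none is missed.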


From Corollary 14 we obtain Steiner systems S(4, 5, 11), S(3,4, 10), and 
S (2, 3, 9). 


Nonlinear single-error-correcting codes. It will be convenient to rearrange the 
coordinates of the (12,132,4) code of Theorem 30 so that it contains the 
codeword (or block) 111111000000. Let the coordinates be labelled 
x_1, ..., x_12. We may now include the 12 vectors of Fig. 2.16 in the code 
without decreasing the minimum distance, and obtain a (12, 144, 4) code, 
which we denote by 𝒞_12. 


    x_1 x_2 x_3 x_4 x_5 x_6 x_7 x_8 x_9 x_10 x_11 x_12 
     1   1   0   0   0   0   0   0   0   0    0    0 
     0   0   1   1   0   0   0   0   0   0    0    0 
     .   .   . 
     0   0   0   0   0   0   0   0   0   0    1    1 
     0   0   1   1   1   1   1   1   1   1    1    1 
     1   1   0   0   1   1   1   1   1   1    1    1 
     .   .   . 
     1   1   1   1   1   1   1   1   1   1    0    0 


Fig. 2.16. The 12 extra codewords. 


Lemma 31. Some codeword c in 𝒞_12 is at distance 4 from 51 other codewords. 


Proof. We take c = 111111 000000. The block intersection numbers λ_ij for 
S(5, 6, 12) are as follows. 


                      132 
                    66    66 
                 30    36    30 
              12    18    18    12 
            4     8    10     8     4 
         1     3     5     5     3     1 
      1     0     3     2     3     0     1 


The blocks at distance 4 from c must meet c in exactly 4 places. The number 
of such blocks is C(6,4) λ_{6,4} = 15·3 = 45. Six of the codewords of Fig. 2.16 are also 
at distance 4 from c, for a total of 51. Q.E.D. 
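
The triangle of block intersection numbers can be regenerated mechanically. The sketch below is ours and rests on two assumptions that should be checked against the definition of λ_ij used earlier in the chapter: that λ_ii is the number of blocks through i given points of a block, and that each entry is the sum of the two entries below it. The function name `intersection_triangle` is hypothetical.

```python
from math import comb

def intersection_triangle(t, k, v):
    """Block intersection numbers of a Steiner system S(t, k, v).

    lam[i][j] counts the blocks containing j chosen points of a fixed
    block and avoiding i - j further points of it (0 <= j <= i <= k).
    """
    lam = [[0] * (i + 1) for i in range(k + 1)]
    for i in range(k + 1):
        # Blocks through i points of a block: the usual count for i <= t,
        # and only the block itself once i exceeds t.
        lam[i][i] = comb(v - i, t - i) // comb(k - i, t - i) if i <= t else 1
        for j in range(i - 1, -1, -1):
            # Pascal property: lam[i-1][j] = lam[i][j] + lam[i][j+1].
            lam[i][j] = lam[i - 1][j] - lam[i][j + 1]
    return lam

triangle = intersection_triangle(5, 6, 12)
for row in triangle:
    print(row)
# Blocks meeting c = 111111000000 in exactly 4 places.
print(comb(6, 4) * triangle[6][4])
```

Under these assumptions the rows printed match the triangle above, and the last line gives the 45 blocks counted in the proof.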


(It is clear from this proof that no codeword is at distance 4 from more than 
51 codewords.) 

We shall construct 4 shorter codes from 𝒞_12: 

(i) Taking the codewords in 𝒞_12 for which x_12 = 1, we obtain an (11, 72, 4) 
code 𝒞_11. A similar argument to that of Lemma 31 shows that there is a 
codeword in 𝒞_11 at distance 4 from 34 others. 

(ii) Taking the codewords in €, for which x= 1, xio— 0, we obtain a 
(10,38, 4) code 65%. There is a codeword in €% at distance 4 from 22 others. 

(iii) Taking the codewords in €: for which xi;— xu, — l, we obtain a 
(10, 36, 4) code €% which contains 0. There is a codeword at distance 4 from 
30 others. 

(iv) Taking the codewords in 𝒞_12 for which x_11 = x_12 = 1, x_10 = 0, we obtain a 
(9, 20, 4) code 𝒞_9 containing 0. There is a codeword at distance 4 from 18 
others. (Another (9, 20, 4) code was given in §4.) This code becomes an 
(8, 20, 3) code if one coordinate is deleted. 

An alternative construction for an (8, 20, 3) code is shown in Fig. 2.17. Note 
that this code is cyclic, in the sense that a cyclic shift of any codeword is 
again a codeword! 

Since the number of codewords is not a power of 2, none of these codes 
is linear. As mentioned in §4 the (9, 20, 4) code is optimal. 


Research Problems (2.4). Are the (10, 38, 4), (11, 72, 4), (12, 144, 4) codes 
optimal? At present the best bounds known are A(10, 4) ≤ 40, A(11, 4) ≤ 80 and 
A(12, 4) ≤ 160 (see Ch. 17 and Appendix A). 

(2.5) Generalize the (12, 144, 4) code by finding other good nonlinear codes 
from the rows of a Hadamard matrix taken 1,2,... at a time. 

(2.6) Generalize Fig. 2.17 by finding other good nonlinear cyclic (n, M, d) 
codes with n greater than 2d. 


00000000 
11010000 
01101000 
00110100 
00011010 
00001101 
10000110 
01000011 
10100001 
11100100 
01110010 
00111001 
10011100 
01001110 
00100111 
10010011 
11001001 
10101010 
01010101 
11111111 


Fig. 2.17. An (8, 20, 3) code. 
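
The parameters claimed for Fig. 2.17 can be checked by direct enumeration. The sketch below is ours: it rebuilds the 20 codewords from one representative of each cyclic class as listed in the figure, then computes the minimum distance and tests closure under cyclic shifts. The helper names are hypothetical.

```python
from itertools import combinations

def cyclic_shifts(word):
    """All cyclic shifts of a binary string."""
    return {word[-s:] + word[:-s] for s in range(len(word))}

# One representative per cyclic class in Fig. 2.17, plus 0 and 1.
code = {"00000000", "11111111"}
for seed in ("11010000", "11100100", "10101010"):
    code |= cyclic_shifts(seed)

def distance(a, b):
    """Hamming distance between two strings of equal length."""
    return sum(x != y for x, y in zip(a, b))

min_dist = min(distance(a, b) for a, b in combinations(sorted(code), 2))
is_cyclic = all(cyclic_shifts(w) <= code for w in code)
print(len(code), min_dist, is_cyclic)
```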


Because the 12 codewords of Fig. 2.16 can be permuted in many ways, 
there are several inequivalent versions of the codes 𝒞_9, ..., 𝒞_12. 


§8. An introduction to the Nordstrom-Robinson code 


The extended Golay code 𝒢_24 may be used to construct some interesting 
nonlinear double-error-correcting codes. 


The construction. It will be convenient to change the order of the columns of 
𝒢_24 so that 𝒢_24 contains the codeword 1111111100...0 = 1^8 0^16. Let G be a 
generator matrix for the new version of 𝒢_24. 

Now the first 7 columns of G are linearly independent, for otherwise 𝒢_24^⊥ 
would contain a nonzero codeword of weight ≤ 7. But 𝒢_24^⊥ = 𝒢_24, so this is 
impossible. Thus the first 7 coordinates may be taken as information symbols, 
and the 8th coordinate is the sum of the first 7. 

We divide up the codewords according to their values on the first 7 
coordinates: there are 2^7 possibilities, and for each of these there are 2^12/2^7 = 
32 codewords. Thus there are 8 × 32 = 256 codewords which begin either with 
seven 0's (with 8th coordinate 0), or with six 0's and a 1 (with 8th coordinate 1). 


Definition. The Nordstrom-Robinson code 𝒩_16 is obtained by deleting the first 
8 coordinates from these 256 vectors. (See Fig. 2.18, where 𝒩_16 is enclosed 
within the double lines.) 


Theorem 32. The Nordstrom-Robinson code 𝒩_16 is a (16, 256, 6) code. 













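[Figure: the list of all codewords in the Golay code, grouped by their first 
7 coordinates (0000000, 1000000, 0100000, ...); the part labelled 
NORDSTROM-ROBINSON CODE is enclosed within double lines.] 


Fig. 2.18. Construction of the Nordstrom-Robinson code. 
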
Proof. Let a, b be distinct codewords of 𝒩_16, obtained by truncating the 
codewords a', b' of 𝒢_24. Since dist (a', b') ≥ 8, and a', b' differ in at most 2 of 
the first 8 coordinates, dist (a, b) ≥ 6. Q.E.D. 


Observe that 𝒩_16 is made up of a linear [16, 5, 8] code ℬ, say (obtained from 
the codewords of 𝒢_24 which are zero in the first 8 coordinates), plus 7 of its 
cosets in 𝒢_24. In Ch. 15 we shall give several generalizations of the 
Nordstrom-Robinson code, namely the Kerdock and Preparata codes, which have 
this same kind of structure. 


Problem. (14) Show that 𝒩_16 is not linear. 


Since 𝒢_24 contains 1, the subcode ℬ contains 1^16. All the other codewords 
of ℬ must then have weight 8. Therefore the weight distribution of ℬ is 


    i:     0    8   16 
    A_i:   1   30    1 


Consider one of the cosets of ℬ in 𝒩_16, say ℬ_i = v_i + ℬ. Clearly the only 
weights which can occur in ℬ_i are 


    6 = 8 − 2,   10 = 12 − 2,   14 = 16 − 2. 


Since 1 ∈ ℬ, if ℬ_i contained a vector of weight 14, it would also contain one 
of weight 2, which is impossible. Also A_6 = A_10. Therefore the weight 
distribution of ℬ_i is 

    i:     6   10 
    A_i:  16   16 


(We shall see in Ch. 14 that ℬ is a first-order Reed-Muller code, and the 
vectors in ℬ_i are bent functions.) Putting all this together we find that the 
weight distribution of 𝒩_16 is as shown in Fig. 2.19. 


    i:     0    6    8   10   16 
    A_i:   1  112   30  112    1 


Fig. 2.19. Weight distribution of Nordstrom-Robinson code 𝒩_16. 


Problems. (15) Show that if c is any codeword of 𝒩_16, the number of 
codewords at distance i from c is also given by Fig. 2.19. Thus 𝒩_16 is distance 
invariant. (Hint: look at the weight distribution of the translated code c + 𝒩_16, 
and use Fig. 2.18.) 

Thus we know that the distance distribution of 𝒩_16 is equal to its weight 
distribution and is given by Fig. 2.19. 

By shortening 𝒩_16 we obtain (15, 128, 6), (14, 64, 6) and (13, 32, 6) codes. All 
four are optimal (Ch. 17). It is known that the (16, 256, 6), (15, 256, 5), 
(15, 128, 6), (14, 128, 5), (14, 64, 6), (13, 64, 5) and (13, 32, 6) codes are unique, 
but the (12, 32, 5) shortened code is not. 

The following problems give alternative constructions of a (12, 32, 5) code 
and of 𝒩_16. 

(16) (Van Lint.) Let 


        1 0 0          1 1 1          0 1 0 
    I = 0 1 0 ,    J = 1 1 1 ,    P = 0 0 1 ,    Q = P^2 = P^T = J − I − P. 
        0 0 1          1 1 1          1 0 0 


Show that 0 and the rows of the matrices α, β, γ, δ form a 
(12, 32, 5) code, where 


J-I I I I JP 
I J-1 I I E ie 
I I J-01 if IQ 
I I! I! J-I QI 

    γ = (J − I, J − I, J − I, J − I), 


        000 111 111 111 
    δ = 111 000 111 111 
        111 111 000 111 
        111 111 111 000 


(17) (Semakov and Zinov'ev.) Let A, B, C, D be 4 × 4 permutation 
matrices such that A + B + C + D = J, let K = A + D, L = A + C and M = 
A + B. Show that the rows of α_1, ..., α_8, and their complements form a 
(16, 256, 6) code, where 


a= 


$= 


AAAA ABCD ACDB 
a=|\A44A4 aœ -|BĀDC a=|C ABD 
AAA A? 2"1C DA BP ^ DBAC? 
AAAA DCBA BDCA 
ADBC OKLM OLMK 
a.=|DPACB a,=|K OML a | OKM 
BCAD" LMoO K? “IM K OL? 
CBDA MLKO K MLO 
OMKL PPPP 0000 
aœ- MOLK REEL ae p-[0101 
KLOMwy * IPPPPV" 0011? 
LKMO PPPP 0110 


and the bar denotes complementation. 


Research Problem 2.7. Generalize the construction of Problems 16 and 17. 
§9. Construction of new codes from old (II) 


The direct sum construction. Given an (n_1, M_1, d_1) code 𝒞_1 and an 
(n_2, M_2, d_2) code 𝒞_2, their direct sum consists of all vectors* |u|v|, where 
u ∈ 𝒞_1, v ∈ 𝒞_2. This is clearly an (n_1 + n_2, M_1 M_2, d = min {d_1, d_2}) code. 
Although simple, this construction is not very useful. More intelligent is: 


The |u|u + v| construction. Given an (n, M_1, d_1) code 𝒞_1 and an (n, M_2, d_2) 
code 𝒞_2, with the same lengths, we may form a new code 𝒞_3 consisting of all 
vectors 


    |u|u + v|,    u ∈ 𝒞_1, v ∈ 𝒞_2. 


Theorem 33. 𝒞_3 is a (2n, M_1 M_2, d = min {2d_1, d_2}) code. 


Proof. Let a = |u|u + v|, b = |u'|u' + v'| be distinct codewords of 𝒞_3, where 
u, u' ∈ 𝒞_1, v, v' ∈ 𝒞_2. If v = v' then dist (a, b) = 2 dist (u, u') ≥ 2d_1. Now 
suppose v ≠ v'. Then 


    dist (a, b) = wt (u − u') + wt (u − u' + v − v') 
                ≥ wt (u − u') + wt (v − v') − wt (u − u'), 
                                          by Problem 8 of Ch. 1 
                = wt (v − v') ≥ d_2.                          Q.E.D. 


This construction builds up good codes very quickly: 


Examples. Taking 𝒞_1 = [4, 3, 2] even weight code, 𝒞_2 = [4, 1, 4] repetition code, 
we get 𝒞_3 = [8, 4, 4] first order Reed-Muller code. Again with 𝒞_1 = this [8, 4, 4] 
code, 𝒞_2 = [8, 1, 8] code, we get 𝒞_3 = [16, 5, 8] first order Reed-Muller code. In 
fact all Reed-Muller codes can be built up in this way, as we shall see in Ch. 
13. 
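
These two examples can be reproduced by brute force. The following Python sketch is ours (the names `u_u_plus_v` and `min_distance` are hypothetical); it represents codewords as tuples of bits and applies the |u|u + v| construction twice.

```python
from itertools import combinations, product

def u_u_plus_v(code1, code2):
    """All vectors |u|u+v| with u in code1, v in code2 (equal lengths)."""
    return [u + tuple(a ^ b for a, b in zip(u, v))
            for u in code1 for v in code2]

def min_distance(code):
    """Smallest Hamming distance between two distinct codewords."""
    return min(sum(a != b for a, b in zip(x, y))
               for x, y in combinations(code, 2))

even4 = [w for w in product((0, 1), repeat=4) if sum(w) % 2 == 0]  # [4, 3, 2]
rep4 = [(0,) * 4, (1,) * 4]                                        # [4, 1, 4]
rm8 = u_u_plus_v(even4, rep4)
rm16 = u_u_plus_v(rm8, [(0,) * 8, (1,) * 8])

print(len(rm8), min_distance(rm8))
print(len(rm16), min_distance(rm16))
```

The sizes and distances printed agree with min {2d_1, d_2} from Theorem 33 in both steps.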


Remarks. (a) If 𝒞_1 and 𝒞_2 are of different lengths the construction still works 
if we add enough zeros at the end of the shorter code. 

(b) If 𝒞_1 and 𝒞_2 are linear, say 𝒞_1 = [n_1, k_1, d_1], 𝒞_2 = [n_2, k_2, d_2], then 𝒞_3 is an 
[n_1 + max (n_1, n_2), k_1 + k_2, d = min {2d_1, d_2}] linear code. 


*If u = u_1 ... u_m, v = v_1 ... v_n then |u|v| denotes the vector u_1 ... u_m v_1 ... v_n of length m + n. 

An infinite family of nonlinear single-error-correcting codes. Starting with the 
(8, 20, 3), ..., (11, 144, 3) codes given in §7, we can construct an infinite 
family: 

    𝒞_1,  𝒞_2  →  𝒞_3 
    (8, 128, 2), (8, 20, 3) → (16, (5/4)·2^11, 3) 
    (9, 256, 2), (9, 20, 4) → (18, (5/4)·2^12, 4) → (17, (5/4)·2^12, 3) 
    . . . 
    (12, 2^11, 2), (12, 144, 4) → (24, (9/8)·2^18, 4) → (23, (9/8)·2^18, 3) 
    (16, 2^15, 2), (16, (5/4)·2^11, 3) → (32, (5/4)·2^26, 3) 


Continuing in this way we obtain 


Theorem 34. For any block length n satisfying 2^m ≤ n < 3·2^{m−1} there exists a 
nonlinear (n, λ·2^{n−m−1}, 3) code, where λ = 9/8, 19/16 or 5/4. 


Remarks. It follows from the sphere-packing bound (Theorem 6 of Ch. 1) that 
the largest single-error-correcting linear code of the same length, which is an 
[n, n − m − 1, 3] shortened Hamming code, has only 2^{n−m−1} codewords. 


Vasil'ev nonlinear single-error-correcting codes. The Hamming codes that 
were constructed in §7 of Ch. 1 are unique: any binary linear code with the 
parameters [n = 2^m − 1, k = n − m, d = 3] is equivalent to a Hamming 
code. This is so because the parity-check matrix must contain all 2^m − 1 
nonzero binary m-tuples in some order. 

But this is not true if the assumption of linearity is dropped. 


Problems. (18) Vasil'ev codes. Let 𝒞 be an (n = 2^m − 1, M = 2^{n−m}, d = 3) 
perfect single-error-correcting binary code, not necessarily linear. Let λ be 
any mapping from 𝒞 to GF(2), with λ(0) = 0, which is strictly nonlinear: 
λ(u + v) ≠ λ(u) + λ(v) for some u, v ∈ 𝒞. Set π(u) = 0 or 1 depending on 
whether wt (u) is even or odd. 

Show that the code 


    𝒱 = {|u|u + v|π(u) + λ(v)| : u ∈ F^n, v ∈ 𝒞} 


is a (2^{m+1} − 1, 2^{2n−m}, 3) perfect single-error-correcting code which is not 
equivalent to any linear code. Show that such codes exist for all m ≥ 3. 
(19) Let 𝒞_i (i = 1, 2) be [n, k_i] linear codes. Show that (i) if 𝒞 = 
{|u|u| : u ∈ 𝒞_1} then 𝒞^⊥ = {|a|a + v| : a arbitrary, v ∈ 𝒞_1^⊥}, (ii) if 𝒞 = {|u|u + v| : 
u ∈ 𝒞_1, v ∈ 𝒞_2} then 𝒞^⊥ = {|a + b|b| : a ∈ 𝒞_1^⊥, b ∈ 𝒞_2^⊥}, (iii) if 𝒞 = 
{|a + x|b + x|a + b + x| : a, b ∈ 𝒞_1, x ∈ 𝒞_2} then 𝒞^⊥ = {|u + w|v + w|u + v + w| : 
u, v ∈ 𝒞_1^⊥, w ∈ 𝒞_2^⊥}. 


Notes on Chapter 2 


§1. The epigraph is from [876]. Plotkin [1064] was the first to seriously study 
nonlinear codes. That the largest double-error-correcting linear code of length 
11 contains 16 codewords follows from Helgert and Stinaff’s useful table 
[636]. For connections between codes and sphere-packing see Leech and 
Sloane [810]. 


§2. Theorem 1 and Corollary 4 are due to Plotkin [1064]. 


§3. References on Hadamard matrices are Baumert [81], Bussemaker and 
Seidel [221], Goethals and Seidel [501], Golomb [522], Hadamard [572], Hall 
[583, 586-588, 590], Van Lint [853], Paley [1019], Ryser [1136], Sylvester 
[1296], Thoene and Golomb [1320], Todd [1328], Turyn [1345-1347], Wallis et 
al. [1386] and Whiteman [1413]. 

For properties of quadratic residues see LeVeque [825, I, Ch. 5], Perron 
[1035], Ribenboim [1111], and Uspensky and Heaslet [1359, Ch. 10]. 

Plotkin [1064] and Bose and Shrikhande [187] construct binary codes from 
Hadamard matrices. Semakov et al. [1179-1182] have generalized these 
constructions to fields with q elements. Theorem 8 is from Levenshtein [819]. 
The largest possible [n, k, d] linear code in the region n «2d is given in 
Theorem 27 of Ch. 17. 


§4. Conference matrices were introduced by Belevitch [95-97]. Other valuable 
references are Delsarte et al. [366], Goethals and Seidel [500], Van Lint 
[853], Van Lint and Seidel [856], Paley [1019], Turyn [1346], and Wallis et al. 
[1386]. The codes C, were given by Sloane and Seidel [1238]. Conference 
matrices have been used to construct Hadamard matrices (Paley [1019], 
Raghavarao [1085, $17.4]), in network theory (Belevitch [95-97]), in weighing 
designs (Raghavarao [1085, Ch. 17]), and in studying strongly regular graphs 
(Seidel [1175-1177]). 


§5. References for t-designs are Alltop [21-24], Bose [173, 177], Carmichael 
[250], Collens [300], Dembowski [370], Doyen and Rosa [386], Hall [585, 587], 
Hanani [595-600], Hughes [673], Hughes and Piper [675], DiPaola et al. 
[1020], Raghavarao [1085], Rokowska [1122], Ryser [1136], Stanton and Col- 
lens [1266], Vajda [1360, 1361], Wilson [1420-1422] and Witt [1423, 1424]. 


§6. 𝒢_23 was discovered by Golay in 1949 [506]. For the extensive literature 
about this code see Ch. 20. 


§7. This construction of S(5, 6, 12) is due to Leech [803]. This design is 
unique - see Problem 19 of Ch. 20. 𝒞_9, ..., 𝒞_12 were discovered by Golay [509] 
and Julin [701]. The constructions given here are from Leech and Sloane [810] 
and Sloane and Whitehead [1239]. 


§8. 𝒩_16 was found by Nordstrom and Robinson [1002] (see also [1119, 1253]) 
and independently discovered by Semakov and Zinov'ev who gave the 
alternative construction of Problem 17 in [1180]. The construction from the 
Golay code is due to Goethals [493] and Semakov and Zinov'ev [1181]. The 
(13, 64, 5) and (12, 32, 5) codes were found by Stevens and Bouricius [1277] in 
1959, and rediscovered by Nadler [982] and Green [556]. The (15, 256, 5), 
(14, 128, 5), (13, 64, 5) and (12, 32, 5) codes contain twice as many codewords 
as any linear code with the same length and minimum distance, since from 
Fontaine and Peterson [434] or Calabi and Myrvaagnes [228] there is no 
[12, 5, 5] linear code. Problem 16 is due to Van Lint [851]. For the uniqueness 
of these codes see Goethals [497] and Snover [1247]. 


§9. The |u|u + v| construction was apparently first given by Plotkin [1064], 
and rediscovered by Sloane and Whitehead [1239]. See also Liu et al. [858], 
where a generalization is used to construct the Nordstrom-Robinson code. 
Theorem 34 is from [1239], where other applications of this construction will 
be found. Problem 18 is from Vasil'ev [1366]. 





An introduction to BCH codes 
and finite fields 


§1. Double-error-correcting BCH codes (I) 


Hamming codes, we saw in Chapter 1, are single-error-correcting codes. 
The codes which in some sense generalize these to correct t errors are called 
Bose-Chaudhuri-Hocquenghem codes (or BCH codes for short), and we 
introduce them in this chapter. We shall also introduce one of the central 
themes in coding theory, namely the theory of finite fields. 

We begin by attempting to find a generalization of the Hamming codes 
which will correct two errors. 

The (binary) Hamming code of length n = 2^m − 1 needed m parity checks to 
correct one error. A good guess is that 2m parity checks will be needed to 
correct two errors. So let's try to construct the parity check matrix H' of the 
double-error-correcting code, by adding m more rows to the parity check 
matrix H of the Hamming code. 

As an example take m = 4, n = 15. Then H has as columns all nonzero 
4-tuples: 


which we abbreviate to 

    H = [1, 2, 3, ..., 14, 15],    (1) 


where each entry i stands for the corresponding binary 4-tuple. We are going 
to add 4 more rows to H, say 


    H' = [  1     2     3    ...   15   ]    (2) 
         [ f(1)  f(2)  f(3)  ...  f(15) ] 

where each f(i) is also a 4-tuple of 0's and 1's. The i-th column of H' is 


    H_i = [  i   ],    (3) 
          [ f(i) ] 

a column vector of length 8. 
How do we choose f(i)? Suppose 2 errors occurred, in positions i and j. 
The syndrome (from Theorem 4 of Ch. 1) is 


    S = H_i + H_j = [    i + j     ] = [ z_1 ], say. 
                    [ f(i) + f(j)  ]   [ z_2 ] 

We must choose f(i) so that the decoder can find i and j from S, i.e., can 
solve the simultaneous equations 

    i + j = z_1, 
    f(i) + f(j) = z_2    (4) 

for i and j, given z_1 and z_2. But all of i, j, f(i), f(j), z_1, z_2 are 4-tuples. 
In order to solve these equations we would like to be able to add, subtract, 
multiply and divide 4-tuples. In other words we want to make 4-tuples into a 
field. We next describe the construction of this field and then return to the 
problem of finding double-error-correcting codes. 


Definition. A field is a set of elements in which it is possible to add, subtract, 
multiply and divide (except that division by 0 is not defined). Addition and 
multiplication must satisfy the commutative, associative, and distributive 
laws: for any α, β, γ in the field 


    α + β = β + α,    αβ = βα, 

    α + (β + γ) = (α + β) + γ,    α(βγ) = (αβ)γ, 

    α(β + γ) = αβ + αγ; 


and furthermore elements 0, 1, −α, α^{−1} (for all α) must exist such that: 


    0 + α = α,    (−α) + α = 0,    0α = 0, 

    1α = α,    and if α ≠ 0, (α^{−1})α = 1. 


A finite field contains a finite number of elements, this number being called 
the order of the field. Finite fields are called Galois fields after their 
discoverer. 


§2. Construction of the field GF(16) 


The 4-tuples of 0's and 1's can clearly be added by vector addition, and in 
our case subtraction is the same as addition. Furthermore a_0 a_1 a_2 a_3 + 
a_0 a_1 a_2 a_3 = 0. We must however define a multiplication. To do this we 
associate with each 4-tuple a polynomial in α: 


    4-tuple    Polynomial 
    0000       0 
    1000       1 
    0100       α 
    1100       1 + α 
    0010       α^2 
    1010       1 + α^2 
    . . . 
    0001       α^3 
    . . . 
    1111       1 + α + α^2 + α^3 

Multiplying two of these polynomials will often give a polynomial of degree 
greater than 3, i.e., something which is not in our set of objects. E.g. 

    1101 · 1001 ↔ (1 + α + α^3)(1 + α^3) = 1 + α + α^4 + α^6. 


We want to reduce the answer to a polynomial of degree ≤ 3. To do this we 
agree that α will satisfy a certain fixed equation of degree 4; a suitable 
equation is 

    π(α) = 1 + α + α^4 = 0    or    α^4 = 1 + α. 


Then α^5 = α + α^2, α^6 = α^2 + α^3, so 

    1 + α + α^4 + α^6 = 1 + α + 1 + α + α^2 + α^3 = α^2 + α^3. 

This is equivalent to dividing by α^4 + α + 1 and keeping the remainder: 


                      α^2 + 1 
    α^4 + α + 1 ) α^6 + α^4 + α + 1 
                  α^6 + α^3 + α^2 
                  ------------------------- 
                  α^4 + α^3 + α^2 + α + 1 
                  α^4 + α + 1 
                  ------------------------- 
                  α^3 + α^2 = remainder 

Thus the product 1101 times 1001 is 

    (1 + α + α^3)(1 + α^3) = 1 + α + α^4 + α^6 = (1 + α^2)π(α) + α^2 + α^3 
                           = α^2 + α^3 since π(α) = 0 
                           ↔ 0011. 
Another way of describing this process is that we reduce a product of 
polynomials modulo π(α): 


    1 + α + α^4 + α^6 = (1 + α^2)π(α) + α^2 + α^3 
                      ≡ α^2 + α^3 mod π(α). 


Similarly α^4 ≡ 1 + α mod π(α), etc. 
Now if this multiplication is to have an inverse, which it must if our system 
is to be a field, π(x) must be irreducible over GF(2). 
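
The multiplication rule just described (multiply as polynomials, then reduce modulo π) can be sketched in a few lines of Python. This is an illustration of ours, not a circuit from the text: field elements are held as integers whose bit i is the coefficient of α^i, and the helper names are hypothetical. The strings follow the 4-tuple convention a_0 a_1 a_2 a_3 used above.

```python
PI = 0b10011  # pi(x) = x^4 + x + 1; bit i is the coefficient of x^i

def from_tuple(s):
    """'a0a1a2a3' string -> integer with bit i equal to a_i."""
    return int(s[::-1], 2)

def to_tuple(x):
    """Inverse of from_tuple for polynomials of degree <= 3."""
    return format(x, "04b")[::-1]

def poly_mul(a, b):
    """Product of two binary polynomials (no reduction)."""
    prod, shift = 0, 0
    while b >> shift:
        if (b >> shift) & 1:
            prod ^= a << shift
        shift += 1
    return prod

def poly_mod(a, m=PI):
    """Remainder of a on division by m over GF(2)."""
    deg_m = m.bit_length() - 1
    while a.bit_length() - 1 >= deg_m:
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a

def gf16_mul(s, t):
    """Multiply two 4-tuples as elements of GF(16)."""
    return to_tuple(poly_mod(poly_mul(from_tuple(s), from_tuple(t))))

print(gf16_mul("1101", "1001"))
```

The printed 4-tuple is 0011, the product found by hand above.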


Definition. A polynomial is irreducible over a field if it is not the product of 
two polynomials of lower degree in the field. Loosely speaking an irreducible 
polynomial is like a prime number: it has no nontrivial factors. Any 
polynomial can be written uniquely (apart from a constant factor) as the 
product of irreducible polynomials (just as any number can be written 
uniquely as the product of prime numbers). We shall see in a moment that 
x^4 + x + 1 is irreducible over GF(2). 


Theorem 1. If π(x) is irreducible, then every nonzero polynomial B(α) of 
degree ≤ 3 has a unique inverse B(α)^{−1} such that 


    B(α)·B(α)^{−1} ≡ 1 mod π(α). 


Proof. Look at the products A(α)B(α) where A(α) runs through all the 
polynomials 

    1, α, α + 1, α^2, ..., α^3 + α^2 + α + 1    (5) 


of degree ≤ 3. These products must all be distinct mod π(α), for if 

    A_1(α)B(α) ≡ A_2(α)B(α) mod π(α) 


then π(α) | (A_1(α) − A_2(α))B(α) and (since π(α) is irreducible) either 
π(α) | A_1(α) − A_2(α) or π(α) | B(α). Because the degrees of 
A_1(α), A_2(α), B(α) are less than the degree of π(α) this can only happen if 
A_1(α) = A_2(α). Thus all the products A(α)B(α) are distinct, and so they must 
also be equal to (5) in some order. In particular for just one A(α), A(α)B(α) = 
1, and A(α) = B(α)^{−1}. Q.E.D. 


Example. (i) To find the inverse of α, note that 1 = α + α^4 = α(1 + α^3) so 
α^{−1} = 1 + α^3. 

(ii) To find the inverse of α + α^2, suppose it is a_0 + a_1 α + a_2 α^2 + a_3 α^3. Then 
(α + α^2)(a_0 + a_1 α + a_2 α^2 + a_3 α^3) = 1, which implies 

    a_2 + a_3 = 1 
    a_0 + a_2 = 0 
    a_0 + a_1 + a_3 = 0 
    a_1 + a_2 = 0 


whose solution is a_0 = a_1 = a_2 = 1, a_3 = 0. Therefore the inverse is 1 + α + α^2. 
Logarithm tables (e.g., Fig. 3.1) make finding an inverse much easier, as 
will be explained below. 
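
The argument in the proof of Theorem 1 translates directly into an exhaustive search. The sketch below is ours and assumes the integer representation in which bit i of an element is the coefficient of α^i; `gf16_mul` and `gf16_inverse` are hypothetical names.

```python
def gf16_mul(a, b):
    """Product in GF(16) with alpha^4 = alpha + 1 (bit i <-> alpha^i)."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:          # reduce using alpha^4 = alpha + 1
            a ^= 0b10011
    return result

def gf16_inverse(b):
    """Find the unique A with A * b = 1 by trying every nonzero A."""
    if b == 0:
        raise ZeroDivisionError("0 has no inverse")
    products = [gf16_mul(a, b) for a in range(1, 16)]
    # As in the proof, the 15 products are a rearrangement of the
    # 15 nonzero elements, so exactly one of them equals 1.
    assert sorted(products) == list(range(1, 16))
    return products.index(1) + 1

ALPHA = 0b0010
print(gf16_inverse(ALPHA))     # 9 = 0b1001, i.e. 1 + alpha^3
print(gf16_inverse(0b0110))    # 7 = 0b0111, i.e. 1 + alpha + alpha^2
```

Note that the binary literals here are written with the coefficient of α^3 on the left, the reverse of the 4-tuple order a_0 a_1 a_2 a_3.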


Division. To find A/B, first find the inverse B^{−1} = 1/B and then use the rule 
A/B = A·B^{−1}. 

We must check that π(x) = x^4 + x + 1 is irreducible. It has degree 4 and so, 
if not irreducible, contains a factor of degree 1 or 2. The only polynomials of 
degree 1 are x and x + 1. Clearly x ∤ π(x). If x + 1 | π(x) then π(−1) = 0. But 
π(−1) = 1 + 1 + 1 = 1 ∴ x + 1 ∤ π(x). What about a factor of degree 2? We can 
rule out x^2 + x and x^2 + 1 = (x + 1)^2. This only leaves x^2 + x + 1, which we test by 
a division using detached coefficients: 


                    1 1 0 
        1 1 1 ) 1 0 0 1 1 
                1 1 1 
                ----- 
                  1 1 1 
                  1 1 1 
                  ----- 
                        1 = remainder 


Therefore x^4 + x + 1 is irreducible. 
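
The same test can be run mechanically: try every polynomial of degree 1 up to half the degree as a divisor. The sketch below is ours (hypothetical names `poly_mod` and `is_irreducible`), with a binary polynomial stored as an integer whose bit i is the coefficient of x^i.

```python
def poly_mod(a, m):
    """Remainder of a on division by m, coefficients in GF(2)."""
    deg_m = m.bit_length() - 1
    while a.bit_length() - 1 >= deg_m:
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a

def is_irreducible(p):
    """Trial division by every polynomial of degree 1 .. deg(p) // 2."""
    deg = p.bit_length() - 1
    # Integers 2 .. 2^(deg//2 + 1) - 1 are exactly those polynomials.
    return all(poly_mod(p, d) != 0 for d in range(2, 1 << (deg // 2 + 1)))

print(poly_mod(0b10011, 0b111))   # remainder of x^4+x+1 by x^2+x+1
print(is_irreducible(0b10011))    # x^4 + x + 1
print(is_irreducible(0b10101))    # x^4 + x^2 + 1 = (x^2 + x + 1)^2
```

Trial division is adequate for small degrees such as this one; it is not meant as an efficient general method.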

Thus we have made the 16 4-tuples of 0's and 1's into a field. This is called 
the Galois field of order 16, abbreviated GF(2^4) or GF(16). The field elements 
can be written in several different ways, as shown in Fig. 3.1. 

We note that the nonzero elements of the field form a cyclic group of order 
15 with generator α, where α^15 = 1; and that we have been fortunate enough to 
choose an irreducible polynomial π(x) which has a generator of this group as 
a zero (cyclic groups are defined on p. 96). 

α (or any other generator of this cyclic group) is called a primitive element 
of GF(2^4). For example α, α^2, α^4 are primitive but α^3, α^5 are not. A 
polynomial having a primitive element as a zero is called a primitive 
polynomial. Not all irreducible polynomials are primitive, e.g. x^4 + x^3 + x^2 + 
x + 1 is irreducible, so could be used to generate the field, but is not a 
primitive polynomial. 
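
To see the difference between the two polynomials concretely, one can compute multiplicative orders. The sketch below is ours; it assumes the integer representation with bit i as the coefficient of α^i, and `gf_mul`, `power` and `order` are hypothetical helper names.

```python
def gf_mul(a, b, modulus):
    """Multiply 4-bit field elements modulo a degree-4 polynomial."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= modulus
    return result

def power(g, k, modulus):
    """g raised to the k-th power by repeated multiplication."""
    x = 1
    for _ in range(k):
        x = gf_mul(x, g, modulus)
    return x

def order(g, modulus):
    """Smallest n >= 1 with g^n = 1 (g must be nonzero)."""
    n, x = 1, g
    while x != 1:
        x = gf_mul(x, g, modulus)
        n += 1
    return n

PI = 0b10011      # x^4 + x + 1
OTHER = 0b11111   # x^4 + x^3 + x^2 + x + 1
ALPHA = 0b0010    # the element alpha (a zero of the modulus)

primitive_exponents = [k for k in range(1, 15)
                       if order(power(ALPHA, k, PI), PI) == 15]
print(primitive_exponents)
print(order(ALPHA, PI), order(ALPHA, OTHER))
```

With the first polynomial α has order 15; with the second its zero has order 5 only, which is why that polynomial is irreducible but not primitive.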

Any nonzero element γ of GF(2^4) can be written uniquely as a power of α, 
say 


    γ = α^i for 0 ≤ i ≤ 14. 

    as a        as a                    as a          
    4-tuple     polynomial              power of α    logarithm 

    0000        0                       0             −∞ 
    1000        1                       1              0 
    0100        α                       α              1 
    0010        α^2                     α^2            2 
    0001        α^3                     α^3            3 
    1100        1 + α                   α^4            4 
    0110        α + α^2                 α^5            5 
    0011        α^2 + α^3               α^6            6 
    1101        1 + α + α^3             α^7            7 
    1010        1 + α^2                 α^8            8 
    0101        α + α^3                 α^9            9 
    1110        1 + α + α^2             α^10          10 
    0111        α + α^2 + α^3           α^11          11 
    1111        1 + α + α^2 + α^3       α^12          12 
    1011        1 + α^2 + α^3           α^13          13 
    1001        1 + α^3                 α^14          14 


Fig. 3.1. GF(2^4) generated by α^4 + α + 1 = 0. 


(Of course α^0 = α^15 = 1.) Then i is called the logarithm (or sometimes the 
index) of γ. It is convenient to say that 0 = α^{−∞}. 
It is helpful to think of the first representation (columns 1 and 2 of Fig. 3.1) 
as resembling the representation of a complex number z in rectangular 
coordinates: 

    z = x + iy, 

and the second (columns 3 and 4) as the representation in polar coordinates: 

    z = re^{iθ}. 

The rectangular representation is best for addition, while the polar 
representation is best for multiplication. 
Indeed, to multiply two field elements, take logarithms and add, 
remembering that α^15 = 1. (So that the logarithms are manipulated mod 15.) 
Example: To multiply 0111 and 1111: 


    element     log 
    0111         11 
    1111        +12 
                --- 
                 23 

But α^15 = 1, so the answer is 0111·1111 = α^23 = α^{23−15} = α^8 = 1010. 


To find a reciprocal: 

    (1010)^{−1} = (α^8)^{−1} = α^{−8} = α^{15−8} = α^7 = 1101. 

To find a square root: 

    (0110)^{1/2} = (α^5)^{1/2} = (α^20)^{1/2} = α^10 = 1110. 
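
The three worked examples can be replayed with a small log/antilog table. The following is a sketch of ours, not taken from the text: it builds the table by repeated multiplication by α, stores elements as integers with bit i the coefficient of α^i, and converts to and from the 4-tuple strings a_0 a_1 a_2 a_3. All function names are hypothetical.

```python
# ANTILOG[i] is alpha^i for alpha^4 = alpha + 1; LOG is the inverse map.
ANTILOG = [1]
for _ in range(14):
    nxt = ANTILOG[-1] << 1
    ANTILOG.append(nxt ^ 0b10011 if nxt & 0b10000 else nxt)
LOG = {x: i for i, x in enumerate(ANTILOG)}

def from_tuple(s):
    return int(s[::-1], 2)

def to_tuple(x):
    return format(x, "04b")[::-1]

def multiply(s, t):
    """Add the logarithms mod 15 and take the antilog."""
    a, b = from_tuple(s), from_tuple(t)
    if a == 0 or b == 0:
        return "0000"
    return to_tuple(ANTILOG[(LOG[a] + LOG[b]) % 15])

def reciprocal(s):
    """Negate the logarithm mod 15 (s must be nonzero)."""
    return to_tuple(ANTILOG[-LOG[from_tuple(s)] % 15])

def square_root(s):
    """Halve the logarithm, first adding 15 if it is odd."""
    a = from_tuple(s)
    if a == 0:
        return "0000"
    e = LOG[a]
    if e % 2:
        e += 15
    return to_tuple(ANTILOG[e // 2])

print(multiply("0111", "1111"), reciprocal("1010"), square_root("0110"))
```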


We shall see in Chapter 4 that any finite field can be constructed in exactly 
the same way, and has the property that the multiplicative group of nonzero 
elements is cyclic, with a primitive element as generator. We shall also see 
that the number of elements in a finite field is a prime power, and that there is 
essentially only one field with a given number of elements. 

In particular, if we take π(x) to be a primitive irreducible polynomial over 
GF(2) of degree m, we get the field GF(2^m) of all 2^m binary m-tuples. 


§3. Double-error-correcting BCH codes (II) 


Now that our new field GF(16) makes it possible to do arithmetic with 
4-tuples, let's return to the problem of designing the double-error-correcting 
BCH code of length 15. How should we choose f(i) in Equation (4) so that 
these equations can be solved (in GF(16))? 

A bad choice would be f(i) = ci, where c is a constant. For then (4) 
becomes 


    i + j = z_1, 
    c(i + j) = z_2 


which are redundant and can't be solved. Another bad choice is f(i) = i^2, for 
i^2 + j^2 = (i + j)^2 (mod 2), and (4) becomes 


    i + j = z_1, 
    (i + j)^2 = z_2, 


which are also redundant. 
A good choice is f(i) = i^3, for then (4) becomes 


    i + j = z_1 ≠ 0, 
    i^3 + j^3 = z_2,    (6) 

which we can solve. We have 

    z_2 = i^3 + j^3 = (i + j)(i^2 + ij + j^2) = z_1(z_1^2 + ij) 
    ∴ ij = z_2/z_1 + z_1^2.    (7) 

From (6), (7), i and j are the roots of 


    x^2 + z_1 x + (z_2/z_1 + z_1^2) = 0    (z_1 ≠ 0).    (8) 
Note that if there are no errors, z_1 = z_2 = 0; while if there is a single error at 
location i, z_2 = i^3 = z_1^3. Thus we have: 


Decoding scheme for double-error-correcting BCH code. Receive y, calculate 
the syndrome S = Hy^T = (z_1, z_2)^T say. Then 


(i) If z_1 = z_2 = 0, decide that no errors occurred. 

(ii) If z_1 ≠ 0, z_2 = z_1^3, correct a single error at location i = z_1. 

(iii) If z_1 ≠ 0, z_2 ≠ z_1^3, form the quadratic (8). If this has 2 distinct roots i and 
j, correct errors at these locations. 

(iv) If (8) has no roots, or if z_1 = 0, z_2 ≠ 0, detect that at least 3 errors 
occurred. 

Let us repeat that i, j, z_1, z_2 are all elements of GF(16), and that (8) is to be 
solved in this field. (Unfortunately the usual formula for solving a quadratic 
equation doesn't work in GF(2^4). One way of finding the roots is by trying 
each element of the field in turn, and another method will be described in §7 
of Ch. 9.) 
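
The four cases of this decoding scheme can be sketched as follows. This is an illustration of ours under stated assumptions: field elements are integers with bit i the coefficient of α^i, a location is identified with the field element α^p and reported as the exponent p, and the quadratic (8) is solved by trying every element in turn. The function names are hypothetical.

```python
# Antilog and log tables for GF(16) with alpha^4 = alpha + 1.
EXP = [1]
for _ in range(14):
    nxt = EXP[-1] << 1
    EXP.append(nxt ^ 0b10011 if nxt & 0b10000 else nxt)
LOG = {x: i for i, x in enumerate(EXP)}

def mul(a, b):
    return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 15]

def div(a, b):
    """a / b for nonzero b."""
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 15]

def decode_syndrome(z1, z2):
    """Return sorted error exponents, or None if >= 3 errors are detected."""
    if z1 == 0:
        return [] if z2 == 0 else None        # cases (i) and (iv)
    if z2 == mul(z1, mul(z1, z1)):
        return [LOG[z1]]                      # case (ii): single error
    c = div(z2, z1) ^ mul(z1, z1)             # constant term of (8)
    roots = [x for x in range(1, 16)
             if (mul(x, x) ^ mul(z1, x) ^ c) == 0]
    if len(roots) == 2:
        return sorted(LOG[x] for x in roots)  # case (iii)
    return None                               # case (iv): no roots

print(decode_syndrome(EXP[14], EXP[1]))
print(decode_syndrome(EXP[7], EXP[4]))
```

For the syndrome (α^14, α) the sketch reports the exponents 6 and 8; for (α^7, α^4) the quadratic has no roots and at least three errors are detected.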


The parity check matrix. Now let us rearrange the matrix H so that the first 
row is in the order 1, α, α^2, α^3, ...; i.e., our matrix is 


    H = [ 1  α    α^2  α^3  α^4   α^5  α^6  α^7  α^8  α^9   α^10  α^11  α^12  α^13  α^14 ]    (9) 
        [ 1  α^3  α^6  α^9  α^12  1    α^3  α^6  α^9  α^12  1     α^3   α^6   α^9   α^12 ] 

This has the important advantage, as we shall see in Ch. 7, of making the code 
cyclic. 

Notice that not all powers of α appear in the second row: this is because α^3 
is not a primitive element (since (α^3)^5 = 1). 

Expanding this in binary we obtain 


        1 0 0 0 1 0 0 1 1 0 1 0 1 1 1 
        0 1 0 0 1 1 0 1 0 1 1 1 1 0 0 
        0 0 1 0 0 1 1 0 1 0 1 1 1 1 0 
    H = 0 0 0 1 0 0 1 1 0 1 0 1 1 1 1    (10) 
        1 0 0 0 1 1 0 0 0 1 1 0 0 0 1 
        0 0 0 1 1 0 0 0 1 1 0 0 0 1 1 
        0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 
        0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 


Example of decoding procedure. First let's suppose two errors occurred, say 
in places 6, 8 (i.e. in the columns (α^6, α^3), (α^8, α^9)). Then z_1 = 1001 = α^14, 
z_2 = 0100 = α, and (z_2/z_1) + z_1^2 = α^2 + α^13 = 1001 = α^14. Thus the equation (8) for 
i, j is 

    x^2 + α^14 x + α^14 = (x + α^6)(x + α^8) 


and indeed the roots give the locations of the errors. 
On the other hand, suppose three errors occurred, in places 0, 1, 3 say. 
Then z_1 = 1101 = α^7, z_2 = 1100 = α^4, and Equation (8) is 

    x^2 + α^7 x + α^5 = 0. 

By trying each element in turn the decoder finds that this equation has no 
zeros in the field, and so decides that at least three errors occurred. 
Nothing in our construction depends on the length being 15, and plainly we 
can use any field GF(2^m) to get a double-error-correcting BCH code of length 
2^m − 1. The parity check matrix is 


    H = [ 1  α    α^2  ...  α^(2^m − 2)    ]    (11) 
        [ 1  α^3  α^6  ...  α^(3(2^m − 2)) ] 

where each entry is to be replaced by the corresponding binary m-tuple. 

The decoding scheme given above shows that this code does indeed correct 
double errors. We return to these codes, and construct t-error-correcting 
BCH codes, in Chapters 7 and 9. 


Problems. (1) Find the locations of the errors if the syndrome is S = 
(1001 0110)^T or (0101 1111)^T. 
(2) If the received vector is 11000···0, what was transmitted? 


§4. Computing in a finite field 


Since elements of GF(2^4) are represented by 4-tuples of 0's and 1's, they 
are easy to manipulate using digital circuits or in a binary computer. In this 
section we give a brief description of some circuits for carrying out 
computations in GF(2^4). (A similar description could be given for any field 
GF(2^m).) For further information see Bartee and Schneider [72], Peterson and 
Weldon [1040, Ch. 7], and Berlekamp [113, Chs. 1-5]. Good references for 
shift registers are Gill [485], Golomb [523], and Kautz [750]. 

The basic building blocks are: 


Storage element (or flip-flop), contains a 0 or a 1. 

Binary adder* (output is 1 iff an odd number of inputs are 1). Also 
called an EXCLUSIVE-OR gate. 

Binary multiplier (output is 1 iff all inputs are 1). Also 
called an AND gate. 


*Strictly speaking this should be called a half-adder, since there is no carry. 

Thus an element of GF(2^4) generated by α^4 + α + 1 = 0 (see Fig. 3.1) is 
represented by 4 0's and 1's, which can be stored in a row of 4 storage 
elements (called a register): 


    [0][0][1][1]  contains the element 0011 ↔ α^2 + α^3 = α^6. 


To multiply by α. The circuit shown in Fig. 3.2, called a linear feedback shift 
register, multiplies the contents of the register by α in GF(2^4). 


Fig. 3.2. 


For if initially it contains 


    [a_0][a_1][a_2][a_3] ↔ a_0 + a_1 α + a_2 α^2 + a_3 α^3, 


then one time instant later it contains 


    [a_3][a_0 + a_3][a_1][a_2] ↔ a_3 + (a_0 + a_3)α + a_1 α^2 + a_2 α^3 
                               = α(a_0 + a_1 α + a_2 α^2 + a_3 α^3). 


If initially this register contains 1000 ↔ 1, then at successive time instants it 
contains 1, α, α^2, ..., α^14, α^15 = 1, α, ..., since α is primitive. So the output of 
the circuit in Fig. 3.3 is periodic with period 15. This is the maximum possible 
period with 4 storage elements (since there are just 2^4 − 1 = 15 nonzero states). 
Segments of length 15 of the output sequence are codewords in a 
maximal-length feedback shift register code, which is a simplex code (see §9 of Ch. 1 
and Ch. 14). 
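
The behaviour of this register is easy to simulate. The sketch below is ours: the state is the tuple (a_0, a_1, a_2, a_3), one clock pulse replaces it by (a_3, a_0 + a_3, a_1, a_2), and, as an assumption of this illustration only, the output is read from the last stage.

```python
def shift(register):
    """One clock pulse: multiply the register contents by alpha."""
    a0, a1, a2, a3 = register
    return (a3, a0 ^ a3, a1, a2)

state = (1, 0, 0, 0)        # the element 1
states = []
for _ in range(15):
    states.append(state)
    state = shift(state)

# Output bit at each time instant (assumed to be the last stage).
output = [s[3] for s in states]

print(len(set(states)), state)
print("".join(map(str, output)), sum(output))
```

After 15 pulses the register is back at 1000, having passed through all 15 nonzero states, and one period of the output sequence has weight 8.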


OUTPUT 


Fig. 3.3. 


To multiply by a fixed element. E.g. to multiply an arbitrary element a_0 + ··· + 
a_3 α^3 by 1 + α^2: 


    (a_0 + a_1 α + a_2 α^2 + a_3 α^3)(1 + α^2) 
        = a_0 + a_1 α + (a_0 + a_2)α^2 + (a_1 + a_3)α^3 + a_2 α^4 + a_3 α^5 
        = (a_0 + a_2) + (a_1 + a_2 + a_3)α + (a_0 + a_2 + a_3)α^2 + (a_1 + a_3)α^3, 


which is accomplished by the circuit in Fig. 3.4. 
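
The four output expressions can be checked against a general field multiplication for all 16 inputs. The sketch below is ours, with hypothetical function names, and uses integers whose bit i is the coefficient of α^i.

```python
def gf16_mul(a, b):
    """Product in GF(16) with alpha^4 = alpha + 1."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= 0b10011
    return result

def times_one_plus_alpha_squared(a0, a1, a2, a3):
    """The four sums formed by the binary adders."""
    return (a0 ^ a2, a1 ^ a2 ^ a3, a0 ^ a2 ^ a3, a1 ^ a3)

ONE_PLUS_ALPHA_SQUARED = 0b0101
mismatches = 0
for x in range(16):
    bits = tuple((x >> i) & 1 for i in range(4))
    y = gf16_mul(x, ONE_PLUS_ALPHA_SQUARED)
    expected = tuple((y >> i) & 1 for i in range(4))
    if times_one_plus_alpha_squared(*bits) != expected:
        mismatches += 1
print(mismatches)
```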

[Figure: a register holding a_0, a_1, a_2, a_3 feeding binary adders whose 
outputs are a_0 + a_2, a_1 + a_2 + a_3, a_0 + a_2 + a_3 and a_1 + a_3.] 

Fig. 3.4. 


Similarly to divide by a fixed element, multiply by the inverse. 


To multiply two arbitrary field elements. Unfortunately (because this is an 
essential step, for example, in decoding BCH and Goppa codes, see Chapters 
9 and 12), this is considerably more difficult. 

Suppose we want to form the product 


    c = a·b = (a_0 + a_1 α + a_2 α^2 + a_3 α^3)(b_0 + b_1 α + b_2 α^2 + b_3 α^3) 


given a and b. 


Methods. (1) (Brute force.) If c = c_0 + c_1 α + c_2 α^2 + c_3 α^3 then 


    c_0 = a_0 b_0 + a_1 b_3 + a_2 b_2 + a_3 b_1, 
    c_1 = a_0 b_1 + a_1(b_0 + b_3) + a_2(b_2 + b_3) + a_3(b_1 + b_2), etc., 


and a large and complicated circuit is needed to form the c_i's from the a_i's 
and b_i's. 
(2) Mimic long multiplication as done by hand. Thus we write 


    c = b_0 a + b_1(αa) + b_2(α^2 a) + b_3(α^3 a), 


and use the circuit in Fig. 3.5. At each step, add α^i a to c iff b_i = 1, and then 
multiply α^i a by α. 

Laws and Rushforth [796] have recently described a cellular array circuit 
which also multiplies in this way, but is iterative in space rather than in time, 
and so is faster than the above circuit. 

[Circuit labels: MULTIPLIES BY $\alpha$; CONTAINS $a, \alpha a, \alpha^2 a, \alpha^3 a$;
ADD $\alpha^i a$ TO $c$ IFF $b_i = 1$; FORMS $c$.]

Fig. 3.5.


(3) Use log and antilog tables. To multiply two elements of GF($2^4$), take
their logarithms to base $\alpha$ (as in §2), where $\alpha$ is a primitive element of the
field, add the logs as integers modulo 15, and take the antilog of the answer.
Figure 3.6 shows this schematically.


[Schematic with an ANTILOG block producing the product $a \cdot b$.]

Fig. 3.6.


Unlike the logarithm of a real number, the logarithm in a finite field is an
extremely irregular function. No good shortcut is known for finding $\log a$,
$a \in$ GF($2^m$). Either one calculates it directly, by computing successive powers
of $\alpha$ until $a$ is reached (which is slow), or, better, a log table is used, as in Fig.
3.1 above, or Figs. 4.1, 4.2. This method is fine for GF($2^4$) but is not
practicable for GF($2^m$) if $m$ is large, especially as an antilog table of the same
length is needed.
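
Method (3) can be sketched as follows for GF($2^4$). This is an illustration under our own assumptions (4-bit integers with bit $i$ = coefficient of $\alpha^i$, $\alpha^4 = \alpha + 1$, hypothetical names `LOG`, `ANTILOG`, `multiply`); zero has no logarithm and is handled separately.

```python
def build_tables():
    """Return (log, antilog) for GF(2^4) with alpha^4 = alpha + 1."""
    antilog = []
    x = 1
    for _ in range(15):
        antilog.append(x)        # antilog[i] = alpha^i as a 4-bit integer
        x <<= 1                  # multiply by alpha
        if x & 0b10000:
            x ^= 0b10011
    log = {value: i for i, value in enumerate(antilog)}
    return log, antilog

LOG, ANTILOG = build_tables()

def multiply(a, b):
    """Multiply by adding logarithms modulo 15 and taking the antilog."""
    if a == 0 or b == 0:
        return 0
    return ANTILOG[(LOG[a] + LOG[b]) % 15]
```

Both tables have 15 entries here; for GF($2^m$) they would each need $2^m - 1$ entries, which is the cost the paragraph above warns about.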

(4) Zech's logarithms (Conway [301]). In this scheme only the polar
representation (i.e. as a power of a primitive element $\alpha$) of the field elements
is used. Multiplication is now easy, but what about addition? This is carried
out by using Zech's logarithms. The Zech's logarithm $Z(n)$ of $n$ is defined by the
equation

$$1 + \alpha^n = \alpha^{Z(n)}$$

(see Fig. 3.7). Then to add $\alpha^m$, $\alpha^n$:

$$\alpha^m + \alpha^n = \alpha^m(1 + \alpha^{n-m}) = \alpha^{m + Z(n-m)}.$$

Thus the antilog table has been eliminated. For example,

$$\alpha^3 + \alpha^5 = \alpha^3(1 + \alpha^2) = \alpha^3\alpha^8 = \alpha^{11}.$$
  n      Z(n)      n      Z(n)
 -∞       0        7       9        Gives Z(n) where
  0      -∞        8       2        $1 + \alpha^n = \alpha^{Z(n)}$
  1       4        9       7
  2       8       10       5
  3      14       11      12
  4       1       12      11
  5      10       13       6
  6      13       14       3

Fig. 3.7. Zech's logarithms in GF($2^4$).
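
The table of Zech's logarithms can be regenerated mechanically. A minimal sketch, assuming GF($2^4$) with $\alpha^4 = \alpha + 1$ and using `None` to stand for $-\infty$ (both the encoding and the function names are our assumptions):

```python
def zech_table():
    """Return Z with 1 + alpha^n = alpha^Z[n] in GF(2^4); None stands for -infinity."""
    antilog, x = [], 1
    for _ in range(15):
        antilog.append(x)
        x <<= 1
        if x & 0b10000:
            x ^= 0b10011          # alpha^4 = alpha + 1
    log = {value: i for i, value in enumerate(antilog)}
    table = {}
    for n in range(15):
        s = antilog[n] ^ 1        # 1 + alpha^n
        table[n] = log[s] if s else None
    return table

Z = zech_table()

def add_powers(m, n):
    """Return k with alpha^m + alpha^n = alpha^k, or None if the sum is 0."""
    z = Z[(n - m) % 15]
    return None if z is None else (m + z) % 15
```

With this table, `add_powers(3, 5)` reproduces the worked sum $\alpha^3 + \alpha^5$ using only exponent arithmetic.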


Notes on Chapter 3 


§2. The theorem that any polynomial over a field can be written uniquely as
the product of irreducible polynomials may be found in any textbook on
algebra; see for example Albert [19, p. 49] or Van der Waerden [1376, Vol. 1,
p. 60].


§3. The decoding scheme. Note that not all quadratic equations can be solved
in GF($2^4$); see Berlekamp [113, p. 243]. This decoding scheme is incomplete,
for it doesn't correct those triple errors that the code is capable of
correcting; see Ch. 9. Other references on computations in Galois fields are
Beard [92], Levitt and Kautz [827] and Tanaka et al. [1300]. See also the
Notes to Ch. 4.


Finite fields 


§1. Introduction


Finite fields are used in most of the known constructions of codes, and for
decoding. They are also important in many branches of mathematics, e.g. in
constructing block designs, finite geometries (see Appendix B), etc. This
chapter gives a description of these fields.

The field GF($2^4$) was defined in Ch. 3 to consist of all polynomials in $x$ with
binary coefficients and degree at most 3, with calculations performed modulo
the irreducible polynomial $\pi(x) = x^4 + x + 1$. This chapter will show that all
finite fields can be obtained in this way.


The fields GF($p$), $p$ = prime. The simplest fields are the following. Let $p$ be a
prime number. Then the integers modulo $p$ form a field of order $p$, denoted by
GF($p$) or $Z_p$. The elements of GF($p$) are $\{0, 1, 2, \ldots, p - 1\}$, and $+, -, \times, \div$ are
carried out mod $p$.

E.g. GF(2) is the binary field $\{0, 1\}$. GF(3) is the ternary field $\{0, 1, 2\}$, with
$1 + 2 = 3 \equiv 0 \pmod 3$, $2 \cdot 2 = 4 \equiv 1 \pmod 3$, $1 - 2 = -1 \equiv 2 \pmod 3$, etc.
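
The arithmetic of GF($p$) is easy to tabulate. The sketch below is illustrative only (the function names are ours); it builds the addition and multiplication tables mod $p$ and checks whether every nonzero element has a multiplicative inverse, which is the property that fails when the modulus is not prime.

```python
def gf_tables(p):
    """Addition and multiplication tables of the integers modulo p."""
    add = [[(a + b) % p for b in range(p)] for a in range(p)]
    mul = [[(a * b) % p for b in range(p)] for a in range(p)]
    return add, mul

def has_inverses(mul):
    """True if every nonzero row of the multiplication table contains a 1."""
    return all(1 in row for row in mul[1:])
```

For example, `gf_tables(3)` reproduces the GF(3) calculations above, while the table for the non-prime modulus 6 fails the inverse check.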


Problems. (1) Check that GF($p$) is a field.
(2) Write out the addition and multiplication tables for GF(5) and GF(7).


If a field E contains a field F we say that E is an extension of F. E.g. the 
field of real numbers is an extension of the field of rational numbers. 


Construction of GF($p^m$). The same construction that was used in Ch. 3 to
construct GF($2^4$) from GF(2) will work in general, provided we know a
polynomial $\pi(x)$ which (i) has coefficients from GF($p$), and (ii) is irreducible over
GF($p$); i.e., is not the product of two polynomials of lower degree with
coefficients from GF($p$). (Such a polynomial will be described briefly as being
irreducible over GF($p$). Corollary 16 will show that such polynomials always
exist.)

E.g. $x^2 + x + 1 = (x + 2)^2$ is not irreducible over GF(3), but $x^2 + x + 2$ and
$x^2 + 1$ are.

If $\pi(x)$ has degree $m$, we proceed as in Ch. 3 and obtain:


Theorem 1. Suppose $\pi(x)$ is irreducible over GF($p$) and has degree $m$. Then
the set of all polynomials in $x$ of degree $\le m - 1$ and coefficients from GF($p$),
with calculations performed modulo $\pi(x)$, forms a field of order $p^m$.


We shall see later (Theorem 6) that there is essentially only one field of
order $p^m$. This is called a Galois field and is denoted by GF($p^m$).

Any member of GF($p^m$) can also be written as an $m$-tuple of elements from
GF($p$), just as we saw in Ch. 3.

The members of GF($p^m$) may also be described as residue classes (cf. §3 of
Ch. 2) of the polynomials in $x$ with coefficients from GF($p$), reduced modulo
$\pi(x)$. Let $\alpha$ denote the residue class of $x$ itself, i.e. the element $0100 \cdots 0$ of
GF($p^m$). Then from the construction of GF($p^m$), $\pi(\alpha) = 0$. Thus the equation
$\pi(x) = 0$ has a root $\alpha$ in GF($p^m$). We say that GF($p^m$) was obtained from
GF($p$) by adjoining to GF($p$) a zero of $\pi(x)$.

Then GF($p^m$) consists of all polynomials in $\alpha$ of degree $\le m - 1$, with
coefficients from GF($p$). An element of GF($p^m$) is:

$$a_0 a_1 \cdots a_{m-1} = a_0 + a_1\alpha + \cdots + a_{m-1}\alpha^{m-1},$$

where $a_i \in$ GF($p$) and $\pi(\alpha) = 0$.

This construction also works for infinite fields. For example, let $Q$ denote
the rational numbers. If we adjoin to $Q$ a zero of the polynomial $x^2 - 2$ we
obtain the field $Q(\sqrt{2})$ consisting of all numbers $a + b\sqrt{2}$, with addition and
multiplication just as one would expect. Finite fields are a good deal simpler
than infinite fields.


Example. Using the irreducible polynomial $\pi(x) = x^2 + x + 2$ over GF(3), we
obtain the representation of GF($3^2$) shown in Fig. 4.1.
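
The representation in this example can be generated by repeatedly multiplying by $\alpha$ and reducing with $\pi(\alpha) = 0$. The following is a minimal sketch under our own conventions: an element is a tuple $(a_0, \ldots, a_{m-1})$, and `pi` lists the coefficients of the monic polynomial $\pi(x)$ from the constant term upward. The function name is hypothetical.

```python
def powers_of_alpha(p=3, pi=(2, 1, 1)):
    """List alpha^0, alpha^1, ..., alpha^(p^m - 2) as m-tuples over GF(p).

    pi holds the coefficients of the monic polynomial pi(x), constant term
    first; the default is x^2 + x + 2 over GF(3).
    """
    m = len(pi) - 1
    elem = [1] + [0] * (m - 1)
    out = []
    for _ in range(p ** m - 1):
        out.append(tuple(elem))
        carry = elem[-1]                  # coefficient that would become alpha^m
        elem = [0] + elem[:-1]            # multiply by alpha (shift)
        # replace alpha^m by -(pi_0 + pi_1 alpha + ... + pi_{m-1} alpha^{m-1})
        elem = [(e - carry * c) % p for e, c in zip(elem, pi[:m])]
    return out
```

With the default arguments the list starts $10, 01, 12, 22, 20, \ldots$, so that for instance $\alpha^2 = 1 + 2\alpha$ and $\alpha^4 = 2$. If the eight tuples are all distinct, $\alpha$ is primitive.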


Problems. (3) Find binary irreducible polynomials of degrees 2, 3 and 5. List 
the elements of GF(8) and carry out some calculations in this field. 
(4) Find a polynomial of degree 3 which is irreducible over GF(3). 


 00 = 0        = $\alpha^{-\infty}$
 10 = 1        = $\alpha^0$
 01 = $\alpha$        = $\alpha^1$
 12 = $1 + 2\alpha$   = $\alpha^2$
 22 = $2 + 2\alpha$   = $\alpha^3$
 20 = 2        = $\alpha^4$
 02 = $2\alpha$       = $\alpha^5$
 21 = $2 + \alpha$    = $\alpha^6$
 11 = $1 + \alpha$    = $\alpha^7$

Fig. 4.1. GF($3^2$) with $\pi(\alpha) = \alpha^2 + \alpha + 2 = 0$.


Outline of chapter. §2 and §3 give the basic theory of finite fields. Any finite
field contains $p^m$ elements, for some prime $p$ and integer $m \ge 1$, and there is
essentially only one such field, which is denoted by GF($p^m$) (Theorem 6).
Furthermore such a field exists for every $p$ and $m$ (Theorem 7). Theorem 4
shows that every finite field contains a primitive element, enabling us to take
logarithms.

Associated with each element of the field is an irreducible polynomial
called its minimal polynomial. §3 studies these polynomials, which are
important for cyclic codes. In §4 we show how to find irreducible polynomials,
using the important formula (8).

§5 contains tables of small fields and primitive polynomials. In §6 it is
shown that the automorphism group of GF($p^m$) is a cyclic group of order $m$
(Theorem 12). §7 contains a formula (Theorem 15) for the number of
irreducible polynomials. The last two sections discuss different kinds of bases for
GF($p^m$) considered as a vector space over GF($p$). In particular §9 gives a
proof of the important (but difficult) normal basis theorem (Theorem 25).


§2. Finite fields: the basic theory


In Galois fields, full of flowers. 
Primitive elements dance for hours ... 
S.B. Weinstein 


The characteristic of a field. Let $F$ be an arbitrary finite field, of order $q$ say.
$F$ contains the unit element 1, and since $F$ is finite the elements $1$, $1 + 1 = 2$,
$1 + 1 + 1 = 3, \ldots$ cannot all be distinct. Therefore there is a smallest number $p$
such that $p = 1 + 1 + \cdots + 1$ ($p$ times) $= 0$. This $p$ must be a prime number (for
if $rs = 0$ then $r = 0$ or $s = 0$) and is called the characteristic of the field. The
field GF($2^4$) constructed in Ch. 3 has characteristic 2. If $F$ has characteristic $p$,
then $p\beta = 0$ for any element $\beta \in F$.

Thus $F$ contains the $p$ elements

$$0, \quad 1, \quad 1 + 1 = 2, \quad 1 + 1 + 1 = 3, \quad \ldots, \quad 1 + \cdots + 1 = p - 1.$$

The sum and product of such elements have the same form, so these $p$
elements form a subfield GF($p$) of $F$. If $q = p$, there are no other elements in
$F$, and $F$ = GF($p$).


Other elements of $F$. Supposing $q > p$, we choose a maximal set of elements
of $F$ which are linearly independent over GF($p$), say $\beta_0 = 1, \beta_1, \ldots, \beta_{m-1}$.
Then $F$ contains all the elements

$$a_0\beta_0 + a_1\beta_1 + \cdots + a_{m-1}\beta_{m-1}, \quad a_i \in \text{GF}(p),$$

and no others. Thus $F$ is a vector space of dimension $m$ over GF($p$), and
contains $q = p^m$ elements for some prime $p$ and some integer $m \ge 1$ (or, $F$ has
order $p^m$). Let $F^*$ stand for the set of $q - 1$ nonzero elements of $F$.

The important special property of finite, as distinct from infinite, fields is: 


Theorem 2. $F^*$ is a cyclic multiplicative group of order $p^m - 1$. [A finite
multiplicative group is cyclic if it consists of the elements $1, a, a^2, \ldots, a^{r-1}$,
with $a^r = 1$. Then $a$ is called a generator of the group.]


Proof. That $F^*$ is a multiplicative group follows from the definition of a field.
Let $\alpha \in F^*$. Since $F^*$ has size $p^m - 1$, $\alpha^i$ has at most $p^m - 1$ distinct values.
Therefore there are integers $r$ and $i$, with $1 \le r \le p^m - 1$, such that $\alpha^{r+i} = \alpha^i$, or
$\alpha^r = 1$. The smallest such $r$ is called the order of $\alpha$.

Now choose $\alpha$ so that $r$ is as large as possible. We shall show that the
order $l$ of any element $\beta \in F^*$ divides $r$. For any prime $\pi$, suppose $r = \pi^a r'$,
$l = \pi^b l'$, where $r'$ and $l'$ are not divisible by $\pi$. Then $\alpha^{\pi^a}$ has order $r'$, $\beta^{l'}$ has
order $\pi^b$, and $\alpha^{\pi^a}\beta^{l'}$ has order $\pi^b r'$ (by Problem 7). Hence $b \le a$ or else $r$
would not be maximal. Thus every prime power that is a divisor of $l$ is also a
divisor of $r$, and so $l$ divides $r$. Hence every $\beta$ in $F^*$ satisfies the equation
$x^r - 1 = 0$. This means that $x^r - 1$ is divisible by $\prod_{\beta \in F^*} (x - \beta)$. Since there are
$p^m - 1$ elements in $F^*$, $r \ge p^m - 1$. But $r \le p^m - 1$, hence $r = p^m - 1$. Thus
$\prod_{\beta \in F^*} (x - \beta) = x^{p^m - 1} - 1$, and the nonzero elements of $F$ form the cyclic group
$\alpha, \alpha^2, \ldots, \alpha^{p^m - 2}, \alpha^{p^m - 1} = 1$. Q.E.D.


Corollary 3. (Fermat's theorem.) Every element $\beta$ of a field $F$ of order $p^m$
satisfies the identity

$$\beta^{p^m} = \beta,$$

or equivalently is a root of the equation

$$x^{p^m} = x.$$

Thus

$$x^{p^m} - x = \prod_{\beta \in F} (x - \beta). \qquad (1)$$

If $F$ is a field of order $p^m$, an element $\alpha$ of $F$ is called primitive if it has
order $p^m - 1$ (cf. §2 of Ch. 3). Then it follows that any nonzero element of $F$
is a power of $\alpha$. A second corollary to Theorem 2 is:


Theorem 4. Any finite field $F$ contains a primitive element.

Proof. Take $\alpha$ to be a generator of the cyclic group $F^*$. Q.E.D.
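
Theorems 2 and 4 can be observed directly in a small field. The sketch below is an illustration under our assumptions (GF($2^4$) stored as 4-bit integers with $\alpha^4 = \alpha + 1$; hypothetical names `multiply`, `order`, `ORDERS`, `PRIMITIVE`); it computes the order of every nonzero element by repeated multiplication.

```python
def multiply(a, b):
    """Shift-and-add product in GF(2^4) with alpha^4 = alpha + 1."""
    c = 0
    for i in range(4):
        if (b >> i) & 1:
            c ^= a
        a <<= 1
        if a & 0b10000:
            a ^= 0b10011
    return c

def order(a):
    """Smallest r >= 1 with a^r = 1, for a nonzero element a."""
    r, x = 1, a
    while x != 1:
        x = multiply(x, a)
        r += 1
    return r

ORDERS = {a: order(a) for a in range(1, 16)}
PRIMITIVE = [a for a, r in ORDERS.items() if r == 15]
```

Every order divides 15, and the elements of order 15 are exactly the primitive elements; $\alpha$ itself (stored as `0b0010`) is one of them.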
The following lemma is very useful. 


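Lemma 5. In any field of characteristic $p$,

$$(x + y)^p = x^p + y^p.$$


Proof. From Problem 18 of Ch. 1,

$$(x + y)^p = \sum_{k=0}^{p} \binom{p}{k} x^k y^{p-k},$$

where

$$\binom{p}{0} = \binom{p}{p} = 1.$$

Also if $1 \le k \le p - 1$,

$$\binom{p}{k} = \frac{p(p-1)\cdots(p-k+1)}{1 \cdot 2 \cdots k} \equiv 0 \pmod p$$

since the numerator contains a factor of $p$ but the denominator does not.
Q.E.D.

E.g. in a field of characteristic 2, we have

$$(x + y)^2 = x^2 + y^2, \quad (x + y)^4 = x^4 + y^4, \quad (x + y)^8 = x^8 + y^8, \ldots$$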


We end this section with an application of Corollary 3. Using it in the field
GF($p$) we get the important result that

$$b^{p-1} \equiv 1 \pmod p \qquad (2)$$

for any integer $b$ which is not a multiple of $p$. This is the usual version of
Fermat's theorem, which implies for example, that

$$2^6 \equiv 1 \pmod 7, \quad 2^{10} \equiv 1 \pmod{11},$$

i.e. $7 \mid 2^6 - 1 = 63$, $\ 11 \mid 2^{10} - 1 = 1023$.
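
Equation (2) is easy to confirm numerically with Python's built-in three-argument `pow`. The helper names below are ours; the second function uses (2) in the form $b \cdot b^{p-2} \equiv 1$ to compute inverses in GF($p$).

```python
def fermat_holds(p):
    """Check b^(p-1) = 1 (mod p) for every b = 1, ..., p-1."""
    return all(pow(b, p - 1, p) == 1 for b in range(1, p))

def inverse_mod_p(b, p):
    """Inverse of b in GF(p), p prime, via b^(p-2)."""
    return pow(b, p - 2, p)
```

The check succeeds for primes such as 7 and 11 and fails for a composite modulus such as 9.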


Problems. (5) Derivative of a polynomial over a finite field. If

$$f(x) = \sum_{i=0}^{n} a_i x^i, \quad a_i \in \text{GF}(p^m),$$

define the derivative

$$f(x)' = \sum_{i=1}^{n} i a_i x^{i-1}.$$

Show that (i) $(f(x) + g(x))' = f(x)' + g(x)'$. (ii) $(f(x)g(x))' = f(x)'g(x) +
f(x)g(x)'$. (iii) If $(x - a)^k$ divides $f(x)$, show that $(x - a)^{k-1}$ divides $f(x)'$. (iv) If $f(x)$
has no multiple zeros in any extension field, then $f(x)$ and $f(x)'$ are relatively
prime. (v) If $p = 2$, $f(x)'$ contains only even powers and is a perfect square. Also
$f(x)'' = 0$.

(6) Suppose that $F$ is a finite extension field of GF($p$) which contains all
the zeros of $x^{p^m} - x$. (i) Show that $x^{p^m} - x$ has distinct zeros in $F$. [Hint: show
that this polynomial and its derivative are relatively prime.] (ii) Prove directly
that these zeros form a field.

(7) Let $G$ be a commutative group, containing elements $g$ and $h$ of orders $r$
and $s$ respectively. (i) Show that if $g^n = 1$ then $r \mid n$. (ii) Show that if $r$ and $s$ are
relatively prime then $gh$ has order $rs$. (iii) Show that if $r = r_1 r_2$ then $g^{r_1}$ has
order $r_2$.

(8) The Euler $\varphi$-function (or totient function) $\varphi(m)$ is the number of
positive integers $\le m$ that are relatively prime to $m$, for any positive integer
$m$.
(i) Show

$$\varphi(m) = m \prod_{p \mid m} \left(1 - \frac{1}{p}\right),$$

where $p$ runs through the primes dividing $m$.
(ii) Show

$$\sum_{d \mid m} \varphi(d) = m,$$

where the sum is over all divisors of $m$, including 1 and $m$.

(iii) Prove the Fermat-Euler theorem: If $a$ and $m$ are relatively prime, then
$a^{\varphi(m)} \equiv 1 \pmod m$.

(9) Let $A_m$ be the number of primitive elements in GF($p^m$). Show that
$A_m = \varphi(p^m - 1)$, where $\varphi$ is the Euler function defined in Problem 8. Hence
show $\liminf_{m \to \infty} A_m / p^m = 0$.
§3. Minimal polynomials


Fermat's theorem (Corollary 3) implies that every element $\beta$ of GF($q$),
$q = p^m$, satisfies the equation

$$x^q - x = 0. \qquad (3)$$

This polynomial has all its coefficients from the prime field GF($p$), and is
monic (has leading coefficient 1). But $\beta$ may satisfy a lower degree equation
than (3).


Definition. The minimal polynomial over GF($p$) of $\beta$ is the lowest degree
monic polynomial $M(x)$ say with coefficients from GF($p$) such that

$$M(\beta) = 0.$$


Example. In GF($2^4$), where the minimal polynomials have coefficients equal to
0 or 1, we have:

Element        Minimal Polynomial
0              $x$
1              $x + 1$
$\alpha$            $x^4 + x + 1$
$\alpha^7$          $x^4 + x^3 + 1$
$\alpha^3$          $x^4 + x^3 + x^2 + x + 1$
$\alpha^5$          $x^2 + x + 1$


We shall see how to find minimal polynomials in §4.

Properties of minimal polynomials. Suppose $M(x)$ is the minimal polynomial
of $\beta \in$ GF($p^m$).

Property (M1). $M(x)$ is irreducible.


Proof. If $M(x) = M_1(x)M_2(x)$ with the degrees of $M_1(x)$ and $M_2(x)$ both $> 0$,
then $M(\beta) = M_1(\beta)M_2(\beta) = 0$ and so either $M_1(\beta) = 0$ or $M_2(\beta) = 0$, contradicting
the fact that $M(x)$ is the lowest degree polynomial with $\beta$ as a root.
Q.E.D.


Property (M2). If $f(x)$ is any polynomial (with coefficients in GF($p$)) such that
$f(\beta) = 0$, then $M(x) \mid f(x)$.


Proof. By dividing $M(x)$ into $f(x)$, write

$$f(x) = M(x)a(x) + r(x),$$

where the degree of the remainder $r(x)$ is less than that of $M(x)$. Put $x = \beta$:

$$0 = 0 + r(\beta),$$

and so $r(x)$ is a polynomial of lower degree than $M(x)$ having $\beta$ as a root.
This is a contradiction unless $r(x) = 0$, and then $f(x)$ is divisible by $M(x)$.
Q.E.D.


Property (M3).

$$M(x) \mid x^{p^m} - x.$$


Proof. From (M2) and Corollary 3. Q.E.D.


Property (M4). $\deg M(x) \le m$.


Proof. GF($p^m$) is a vector space of dimension $m$ over GF($p$). Therefore any
$m + 1$ elements, such as $1, \beta, \ldots, \beta^m$, are linearly dependent, i.e., there exist
coefficients $a_i \in$ GF($p$), not all zero, such that

$$\sum_{i=0}^{m} a_i \beta^i = 0.$$

Thus

$$\sum_{i=0}^{m} a_i x^i$$

is a polynomial of degree $\le m$ having $\beta$ as a root. Therefore $\deg M(x) \le m$.
Q.E.D.


Property (M5). The minimal polynomial of a primitive element of GF($p^m$) has
degree $m$. Such a polynomial is called a primitive polynomial.


Proof. Let $\beta$ be a primitive element of GF($p^m$), with minimal polynomial $M(x)$
of degree $d$. As in Theorem 1 we may use $M(x)$ to generate a field $F$ of order
$p^d$. But $F$ contains $\beta$ and hence all of GF($p^m$), so $d \ge m$. By (M4) $d \le m$.
Q.E.D.


Note. If an irreducible polynomial $\pi(x)$ is used to construct GF($p^m$) and
$\alpha \in$ GF($p^m$) is a root of $\pi(x)$, then obviously $\pi(x)$ is the minimal polynomial
of $\alpha$.

We can now prove the uniqueness of GF($p^m$).
Theorem 6. All finite fields of order $p^m$ are isomorphic. [Two fields $F$, $G$ are
said to be isomorphic if there is a one-to-one mapping from $F$ onto $G$ which
preserves addition and multiplication.]


Proof. Let $F$ and $G$ be fields of order $p^m$, and suppose $\alpha$ is a primitive
element of $F$ with minimal polynomial $M(x)$. By (M3), $M(x) \mid x^{p^m} - x$.
Therefore from Corollary 3 there is an element $\beta$ (say) of $G$ which has minimal
polynomial $M(x)$. Now $F$ can be considered to consist of all polynomials in $\alpha$
of degree $\le m - 1$, i.e. $F$ consists of polynomials modulo $M(x)$. Furthermore
$G$ contains (and therefore consists of) all polynomials in $\beta$ of degree $\le m - 1$.
Therefore the mapping $\alpha \leftrightarrow \beta$ is an isomorphism $F \leftrightarrow G$. Q.E.D.


For example, consider the two versions of GF($2^3$) shown in Fig. 4.2, one
defined by $\pi(x) = x^3 + x + 1$, the other by $\pi(x) = x^3 + x^2 + 1$.


defined by $x^3 + x + 1$        defined by $x^3 + x^2 + 1$
000 = 0                      000 = 0
100 = 1                      100 = 1
010 = $\alpha$                    010 = $\gamma$
001 = $\alpha^2$                  001 = $\gamma^2$
110 = $\alpha^3$                  101 = $\gamma^3$
011 = $\alpha^4$                  111 = $\gamma^4$
111 = $\alpha^5$                  110 = $\gamma^5$
101 = $\alpha^6$                  011 = $\gamma^6$
($\alpha^7 = 1$)                  ($\gamma^7 = 1$)

Fig. 4.2. Two versions of GF($2^3$)


Then $\alpha$ and $\beta = \gamma^3$ both have minimal polynomial $x^3 + x + 1$, and $\alpha \leftrightarrow \gamma^3$ is an
isomorphism between the two versions. For example $1 + \alpha^2 = \alpha^6$ in the first
version becomes $1 + (\gamma^3)^2 = (\gamma^3)^6$ in the second version. We take the first
version as our standard version (see Fig. 4.5).
Finite fields can also be represented as irreducible cyclic codes; see Ch. 8.
Conversely, GF($p^m$) always exists:


Theorem 7. For any prime $p$ and integer $m \ge 1$, there is a field of order $p^m$,
which is denoted by GF($p^m$). (By the previous theorem, this field is essentially
unique.)


Proof. For $m = 1$, GF($p$) = $Z_p$ was defined in §1. Suppose then that $m > 1$, and
set $F_1$ = GF($p$). The idea of the proof is to construct a sequence of fields until
we reach one, $F_r$ say, which contains all the zeros of $x^{p^m} - x$. Then from
Problem 6 the zeros of $x^{p^m} - x$ in $F_r$ form the desired field of order $p^m$. Let
$f_1(x)$ be an irreducible factor of degree $\ge 2$ of $x^{p^m} - x$ over $F_1$ (if there is one),
and use the construction of Theorem 1 with $\pi(x) = f_1(x)$ to obtain a new field
$F_2$. Let $f_2(x)$ be an irreducible factor of degree $\ge 2$ of $x^{p^m} - x$ over $F_2$ (if there
is one), and again use the construction of Theorem 1 with $\pi(x) = f_2(x)$ to
obtain a new field $F_3$. After a finite number of steps we arrive at a field $F_r$
which contains all the zeros of $x^{p^m} - x$. This completes the proof of the theorem.


Alternative proof. From Corollary 16 below there exists an irreducible
polynomial of degree $m$ over GF($p$). The theorem then follows from the
construction of Theorem 1. Q.E.D.


Problem. (10) Show that the mapping $\gamma \to \gamma^p$ is an isomorphism from GF($p^m$)
onto itself.


Subfields of GF($p^m$). We could iterate the construction of Theorem 1 (as we
did in the proof of Theorem 7). First, obtain the field GF($p^m$) from GF($p$) by
adjoining a zero of a polynomial $\pi(x)$ of degree $m$ which is irreducible over
GF($p$). Now let $f(x)$ be a polynomial of degree $n$ which is irreducible over
GF($p^m$). Form a new field from GF($p^m$) by adjoining a zero of $f(x)$. The
argument used in Ch. 3 now shows that this new field has $p^{mn}$ elements.
Hence by Theorem 6 it is GF($p^{mn}$). So iterating the construction doesn't give
any new fields.

But going from GF($p$) to GF($p^{mn}$) in two steps does allow us to deduce the
following useful theorem.


Theorem 8. (i) GF($p^r$) contains a subfield (isomorphic to) GF($p^s$) iff $s$ divides
$r$. (ii) If $\beta \in$ GF($p^r$) then $\beta$ is in GF($p^s$) iff $\beta^{p^s} = \beta$. In any field if $\beta^2 = \beta$ then $\beta$
is 0 or 1.


The proof requires a lemma. 


Lemma 9. If $n, r, s$ are integers with $n \ge 2$, $r \ge 1$, $s \ge 1$, then

$$n^s - 1 \mid n^r - 1 \quad \text{iff} \quad s \mid r.$$

(Recall that the vertical slash means "divides".)


Proof. Write $r = Qs + R$, where $0 \le R < s$. Then

$$\frac{n^r - 1}{n^s - 1} = n^R \cdot \frac{n^{Qs} - 1}{n^s - 1} + \frac{n^R - 1}{n^s - 1}.$$

Now $n^{Qs} - 1$ is always divisible by $n^s - 1$. The last term is less than 1 and so is
an integer iff $R = 0$. Q.E.D.


Problem. (11) (i) Show that in any field

$$x^s - 1 \mid x^r - 1 \quad \text{iff} \quad s \mid r.$$

(ii) Show that g.c.d. $\{x^r - 1, x^s - 1\} = x^d - 1$, where
$d$ = g.c.d. $\{r, s\}$.


Proof of Theorem 8. (i) If $s \mid r$ then from Problem 6 GF($p^r$) contains a subfield
isomorphic to GF($p^s$). For the converse, let $\beta$ be a primitive element of
GF($p^s$). Then

$$\beta^{p^s - 1} = 1, \quad \beta^{p^r - 1} = 1.$$

So $p^s - 1 \mid p^r - 1$, and $s \mid r$ by the lemma.
(ii) The first statement is immediate from Corollary 3, and the second
statement is obvious. Q.E.D.
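
Part (ii) of Theorem 8 gives a practical membership test for subfields. A minimal sketch for GF($2^4$), assuming 4-bit integers with $\alpha^4 = \alpha + 1$ (the encoding and the names `multiply`, `power`, `subfield` are ours):

```python
def multiply(a, b):
    """Shift-and-add product in GF(2^4) with alpha^4 = alpha + 1."""
    c = 0
    for i in range(4):
        if (b >> i) & 1:
            c ^= a
        a <<= 1
        if a & 0b10000:
            a ^= 0b10011
    return c

def power(a, n):
    """a^n by repeated multiplication (n >= 0)."""
    x = 1
    for _ in range(n):
        x = multiply(x, a)
    return x

def subfield(s):
    """Elements beta of GF(2^4) satisfying beta^(2^s) = beta."""
    return [b for b in range(16) if power(b, 2 ** s) == b]
```

Here `subfield(2)` returns the four elements $0, 1, \alpha^5, \alpha^{10}$ forming GF($2^2$), while `subfield(3)` returns only 0 and 1, in line with part (i) since 3 does not divide 4.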


For example, the subfields of GF($2^{12}$) are shown in Fig. 4.3.


Fig. 4.3. Subfields of GF($2^{12}$).


Conjugates and cyclotomic cosets. 


Property (M6). $\beta$ and $\beta^p$ have the same minimal polynomial. In particular, in
GF($2^m$), $\beta$ and $\beta^2$ have the same minimal polynomial.


Proof by example. Suppose $\beta \in$ GF($2^4$) has minimal polynomial $x^4 + x^3 + 1$.
Then

$$(\beta^2)^4 + (\beta^2)^3 + 1 = (\beta^4 + \beta^3 + 1)^2 \quad \text{by Lemma 5}$$
$$= 0.$$

So by (M2) the minimal polynomial of $\beta^2$ divides $x^4 + x^3 + 1$. But $(\beta^2)^8 = \beta$, so
we can use the same argument to show that the minimal polynomial of $\beta$
divides that of $\beta^2$. Therefore they are equal. Q.E.D.


Elements of the field with the same minimal polynomial are called
conjugates. (This is the reason $i$ and $-i$ are called conjugate complex numbers:
both have minimal polynomial $x^2 + 1$ over the reals.)

Let's look at what happens in GF($2^4$). By (M6), the following elements all
have the same minimal polynomial: $\alpha$, $\alpha^2$, $(\alpha^2)^2 = \alpha^4$, $(\alpha^4)^2 = \alpha^8$ (and $(\alpha^8)^2 = \alpha^{16} = \alpha$
again). Likewise $\alpha^3$, $\alpha^6$, $\alpha^{12}$, $\alpha^{24} = \alpha^9$ (and $\alpha^{18} = \alpha^3$ again) all have the same
minimal polynomial, and so on. We see that the powers of $\alpha$ fall into disjoint
sets, which we shall call cyclotomic cosets. All $\alpha^j$ where $j$ runs through a
cyclotomic coset have the same minimal polynomial.


Definition. The operation of multiplying by $p$ divides the integers mod $p^m - 1$ into
sets called the cyclotomic cosets mod $p^m - 1$.


The cyclotomic coset containing $s$ consists of

$$\{s, ps, p^2s, p^3s, \ldots, p^{m_s - 1}s\}$$

where $m_s$ is the smallest positive integer such that $p^{m_s} \cdot s \equiv s \pmod{p^m - 1}$.
E.g. the cyclotomic cosets mod 15 (with $p = 2$) are:

$C_0 = \{0\}$,
$C_1 = \{1, 2, 4, 8\}$,
$C_3 = \{3, 6, 12, 9\}$,
$C_5 = \{5, 10\}$,
$C_7 = \{7, 14, 13, 11\}$.

Our notation is that if $s$ is the smallest number in the coset, the coset is called
$C_s$. The subscripts $s$ are called the coset representatives mod $p^m - 1$.
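
The definition above translates directly into a short routine. This is a sketch with a hypothetical function name; it returns each coset keyed by its smallest element, with the members in the order $s, ps, p^2 s, \ldots$.

```python
def cyclotomic_cosets(p, m):
    """Cyclotomic cosets mod p^m - 1 under multiplication by p.

    Returns a dict mapping each coset representative s (the smallest
    element of its coset) to the list [s, p*s, p^2*s, ...] mod p^m - 1.
    """
    n = p ** m - 1
    seen, cosets = set(), {}
    for s in range(n):
        if s in seen:
            continue
        coset, j = [], s
        while j not in coset:
            coset.append(j)
            j = (j * p) % n
        seen.update(coset)
        cosets[s] = coset
    return cosets
```

Calling `cyclotomic_cosets(2, 4)` reproduces the five cosets mod 15 listed above.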


Problems. (12) Verify the cyclotomic cosets for $p = 2$ shown in Fig. 4.4.
(13) Find the cyclotomic cosets mod 8 and 26 ($p = 3$).


Definition of $M^{(i)}(x)$. Let $M^{(i)}(x)$ be the minimal polynomial of $\alpha^i \in$ GF($p^m$).
Of course by (M6)

$$M^{(pi)}(x) = M^{(i)}(x). \qquad (4)$$

From the preceding discussion it follows that if $i$ is in the cyclotomic coset $C_s$,
then in GF($p^m$)

$$\prod_{j \in C_s} (x - \alpha^j) \quad \text{divides} \quad M^{(i)}(x). \qquad (5)$$


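mod 7
$C_0 = \{0\}$
$C_1 = \{1, 2, 4\}$
$C_3 = \{3, 6, 5\}$

mod 31
$C_0 = \{0\}$
$C_1 = \{1, 2, 4, 8, 16\}$
$C_3 = \{3, 6, 12, 24, 17\}$
$C_5 = \{5, 10, 20, 9, 18\}$
$C_7 = \{7, 14, 28, 25, 19\}$
$C_{11} = \{11, 22, 13, 26, 21\}$
$C_{15} = \{15, 30, 29, 27, 23\}$

mod 63
$C_0 = \{0\}$
$C_1 = \{1, 2, 4, 8, 16, 32\}$
$C_3 = \{3, 6, 12, 24, 48, 33\}$
$C_5 = \{5, 10, 20, 40, 17, 34\}$
$C_7 = \{7, 14, 28, 56, 49, 35\}$
$C_9 = \{9, 18, 36\}$
$C_{11} = \{11, 22, 44, 25, 50, 37\}$
$C_{13} = \{13, 26, 52, 41, 19, 38\}$
$C_{15} = \{15, 30, 60, 57, 51, 39\}$
$C_{21} = \{21, 42\}$
$C_{23} = \{23, 46, 29, 58, 53, 43\}$
$C_{27} = \{27, 54, 45\}$
$C_{31} = \{31, 62, 61, 59, 55, 47\}$

mod 127
$C_0 = \{0\}$
$C_1 = \{1, 2, 4, 8, 16, 32, 64\}$
$C_3 = \{3, 6, 12, 24, 48, 96, 65\}$
$C_5 = \{5, 10, 20, 40, 80, 33, 66\}$
$C_7 = \{7, 14, 28, 56, 112, 97, 67\}$
$C_9 = \{9, 18, 36, 72, 17, 34, 68\}$
$C_{11} = \{11, 22, 44, 88, 49, 98, 69\}$
$C_{13} = \{13, 26, 52, 104, 81, 35, 70\}$
$C_{15} = \{15, 30, 60, 120, 113, 99, 71\}$
$C_{19} = \{19, 38, 76, 25, 50, 100, 73\}$
$C_{21} = \{21, 42, 84, 41, 82, 37, 74\}$
$C_{23} = \{23, 46, 92, 57, 114, 101, 75\}$
$C_{27} = \{27, 54, 108, 89, 51, 102, 77\}$
$C_{29} = \{29, 58, 116, 105, 83, 39, 78\}$
$C_{31} = \{31, 62, 124, 121, 115, 103, 79\}$
$C_{43} = \{43, 86, 45, 90, 53, 106, 85\}$
$C_{47} = \{47, 94, 61, 122, 117, 107, 87\}$
$C_{55} = \{55, 110, 93, 59, 118, 109, 91\}$
$C_{63} = \{63, 126, 125, 123, 119, 111, 95\}$

Fig. 4.4. Cyclotomic cosets mod 7, 31, 63 and 127.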


Problem. (14) Show that the coefficients of the LHS of (5), which are the
elementary symmetric functions of the $\alpha^j$'s, are in GF($p$). [Hint: use Theorem
8(ii).] Conclude that the LHS of (5) is equal to $M^{(i)}(x)$.


Property (M7). If $i$ is in $C_s$ then

$$M^{(i)}(x) = \prod_{j \in C_s} (x - \alpha^j). \qquad (6)$$

Furthermore from Equation (1)

$$x^{p^m - 1} - 1 = \prod_{s} M^{(s)}(x), \qquad (7)$$

where $s$ runs through the coset representatives mod $p^m - 1$.
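
Equation (6) gives a direct way to compute minimal polynomials. The following is a minimal sketch for GF($2^4$) only, assuming 4-bit integers with $\alpha^4 = \alpha + 1$; polynomials are lists of field elements with the constant term first, and all names are ours. It multiplies out $\prod_{j \in C_s} (x - \alpha^j)$, whose coefficients turn out to be 0 or 1.

```python
def gf16_mul(a, b):
    """Shift-and-add product in GF(2^4) with alpha^4 = alpha + 1."""
    c = 0
    for i in range(4):
        if (b >> i) & 1:
            c ^= a
        a <<= 1
        if a & 0b10000:
            a ^= 0b10011
    return c

def alpha_pow(k):
    """alpha^k as a 4-bit integer."""
    x = 1
    for _ in range(k):
        x = gf16_mul(x, 0b0010)
    return x

def minimal_polynomial(s):
    """Coefficients (constant term first) of the product of (x - alpha^j), j in C_s."""
    coset, j = [], s % 15
    while j not in coset:
        coset.append(j)
        j = (2 * j) % 15
    poly = [1]
    for j in coset:
        root = alpha_pow(j)
        shifted = [0] + poly                                # x * poly
        scaled = [gf16_mul(c, root) for c in poly] + [0]    # root * poly
        poly = [u ^ v for u, v in zip(shifted, scaled)]     # minus equals plus here
    return poly
```

For example, `minimal_polynomial(5)` returns `[1, 1, 1]`, i.e. $x^2 + x + 1$, and `minimal_polynomial(7)` returns `[1, 0, 0, 1, 1]`, i.e. $x^4 + x^3 + 1$.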


Problems. (15) When $p = 2$ and $n = 2^m - 1$, show that $C_1 \ne C_3$ and $|C_1| = |C_3| =
m$, provided $m \ge 3$. Hence $\deg M^{(1)}(x) = \deg M^{(3)}(x) = m$. What about $C_5$?

(16) If $\pi(x)$ is an irreducible polynomial over GF($q$), say that it belongs to
exponent $e$ if all its roots have order $e$. Show that this implies $\pi(x) \mid x^e - 1$ but
$\pi(x) \nmid x^n - 1$ for $n < e$.

(17) Let $\beta \in$ GF($q$) have minimal polynomial $M(x)$. Show that $\deg M(x) =
d$ iff $d$ is the smallest positive integer such that $\beta^{p^d} = \beta$.

(18) If

$$f(x) = \prod_{i=0}^{m-1} (x - \alpha^{sp^i})$$

show that

$$f(x) = [M^{(s)}(x)]^d$$

where $d = m/m_s$.


Representation of the Field by Matrices. The companion matrix of the
polynomial $\pi(x) = a_0 + a_1x + \cdots + a_{r-1}x^{r-1} + x^r$ is defined to be the $r \times r$ matrix

$$M = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & -a_2 & \cdots & -a_{r-1} \end{pmatrix}.$$


Problems. (19) Show that the characteristic polynomial of $M$, i.e., $\det(M -
\lambda I)$, is equal to $\pi(\lambda)$. Hence deduce that $\pi(M) = 0$.

(20) Let $M$ be the companion matrix of a primitive polynomial $\pi(x)$ of
degree $m$ over GF($q$). Show that $M^i = I$ if $i = q^m - 1$, $M^i \ne I$ for $1 \le i <
q^m - 1$. Deduce that the powers of $M$ are the nonzero elements of GF($q^m$).


For example, take $\pi(x) = x^3 + x + 1$ over GF(2). The elements of GF($2^3$)
can then be represented as:

   0      M      M^2    M^3    M^4    M^5    M^6    M^7 = I

  000    010    001    110    011    111    101    100
  000    001    110    011    111    101    100    010
  000    110    011    111    101    100    010    001

where addition and multiplication in the field correspond to addition and
multiplication of these matrices.

This is a very laborious way of describing the field. It is less trouble to
write the first row of each matrix as a polynomial in $\alpha$, i.e., $M \leftrightarrow \alpha$, $M^2 \leftrightarrow \alpha^2$,
$M^3 \leftrightarrow 1 + \alpha$ etc., and perform multiplication modulo $\alpha^3 + \alpha + 1$. Of course this
gives the same representation of the field that we had in Theorem 1.
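
The matrix representation can be reproduced with a few lines of code. This is an illustrative sketch (function names are ours): matrices are lists of rows with entries in GF($p$), and `coeffs` lists $a_0, \ldots, a_{r-1}$ of the monic polynomial.

```python
def companion(coeffs, p):
    """Companion matrix of a0 + a1 x + ... + a_{r-1} x^{r-1} + x^r over GF(p)."""
    r = len(coeffs)
    rows = [[1 if j == i + 1 else 0 for j in range(r)] for i in range(r - 1)]
    rows.append([(-a) % p for a in coeffs])
    return rows

def mat_mul(A, B, p):
    """Product of two square matrices with entries reduced mod p."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

def powers(M, p, count):
    """Return [M, M^2, ..., M^count]."""
    out, P = [], M
    for _ in range(count):
        out.append(P)
        P = mat_mul(P, M, p)
    return out
```

With `M = companion([1, 1, 0], 2)`, the seven matrices `powers(M, 2, 7)` are distinct, the last is the identity, and their first rows read off $\alpha, \alpha^2, 1 + \alpha, \ldots$ as 3-tuples.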
§4. How to find irreducible polynomials

The first two theorems present the key formulas.


Theorem 10.

$x^{p^m} - x$ = product of all monic polynomials,
irreducible over GF($p$), whose
degree divides $m$. (8)


Proof. (i) Let $\pi(x)$ be an irreducible polynomial over GF($p$) of degree $d$,
where $d \mid m$. The case $\pi(x) = x$ is trivial, so assume $\pi(x) \ne x$. If we use $\pi(x)$
to construct a field, then $\pi(x)$ is the minimal polynomial of one of the
elements and $\pi(x) \mid x^{p^d - 1} - 1$ from (M3). From Lemma 9 and Problem 11,
$p^d - 1 \mid p^m - 1$, and $x^{p^d - 1} - 1 \mid x^{p^m - 1} - 1$. Therefore $\pi(x) \mid x^{p^m} - x$.

(ii) Conversely let $\pi(x)$ be a divisor of $x^{p^m} - x$, irreducible and of degree $d$.
We must show $d \mid m$. Again we can assume $\pi(x) \ne x$, so that $\pi(x) \mid x^{p^m - 1} - 1$. As
in part (i) we use $\pi(x)$ to construct a field $F$ of order $p^d$. Let $\alpha \in F$ be a root
of $\pi(x)$ and let $\beta$ be a primitive element of $F$, say

$$\beta = a_0 + a_1\alpha + \cdots + a_{d-1}\alpha^{d-1}. \qquad (9)$$

Now $\pi(\alpha) = 0$, so $\alpha^{p^m} = \alpha$ and from (9) and Lemma 5 $\beta^{p^m} = \beta$. Thus $\beta^{p^m - 1} = 1$,
so the order of $\beta$, $p^d - 1$, must divide $p^m - 1$. Therefore $d \mid m$ from Lemma 9.
Q.E.D.

A similar argument shows:

Theorem 11. For any field GF($q$), $q$ = prime power,

$x^{q^m} - x$ = product of all monic polynomials,
irreducible over GF($q$), whose
degree divides $m$. (10)

We use Equation (7) and Theorem 10 to find irreducible polynomials and
minimal polynomials. For example when $q = 2$ we proceed as follows.


$m = 1$: Theorem 10 says

$$x^2 + x = x(x + 1).$$

There are two irreducible polynomials of degree 1, namely $x$ and $x + 1$. The
minimal polynomials of 0 and 1 in GF(2) are respectively $x$ and $x + 1$.


$m = 2$:

$$x^{2^2} + x = x^4 + x = x(x + 1)(x^2 + x + 1).$$

There is one irreducible polynomial of degree 2, $x^2 + x + 1$. The minimal
polynomials of the elements of GF($2^2$) are:

element        minimal polynomial
0              $x$
1              $M^{(0)}(x) = x + 1$
$\alpha, \alpha^2$        $M^{(1)}(x) = M^{(2)}(x) = x^2 + x + 1$


$m = 3$:

$$x^{2^3} + x = x^8 + x = x(x + 1)(x^6 + x^5 + x^4 + x^3 + x^2 + x + 1)$$
$$= x(x + 1)(x^3 + x + 1)(x^3 + x^2 + 1). \qquad (11)$$

There are two irreducible polynomials of degree 3, namely
$x^3 + x + 1$ and $x^3 + x^2 + 1$.

In GF($2^3$) defined by $\alpha^3 + \alpha + 1 = 0$ we have:

element            minimal polynomial
0                  $x$
1                  $M^{(0)}(x) = x + 1$
$\alpha, \alpha^2, \alpha^4$      $M^{(1)}(x) = M^{(2)}(x) = M^{(4)}(x) = x^3 + x + 1$
$\alpha^3, \alpha^6, \alpha^5$    $M^{(3)}(x) = M^{(6)}(x) = M^{(5)}(x) = M^{(-1)}(x) = x^3 + x^2 + 1$

Observe that (11) does agree with Theorem 10: $m = 3$ is divisible by 1 and 3,
and $x^8 + x$ is the product of the two irreducible polynomials of degree 1 and
the two of degree 3.

$x^3 + x + 1$ and $x^3 + x^2 + 1$ are called reciprocal polynomials. In general the
reciprocal polynomial of

$$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$

is the polynomial

$$a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$$

obtained by reversing the order of the coefficients. Another way of saying this
is that the reciprocal polynomial of $f(x)$ is

$$x^{\deg f(x)} f(x^{-1}).$$

The roots of the reciprocal polynomial are the reciprocals of roots of the
original polynomial. The reciprocal of an irreducible polynomial is also
irreducible.

So if $\alpha$ has minimal polynomial $M^{(1)}(x)$, we know immediately that $\alpha^{-1}$ has
minimal polynomial $M^{(-1)}(x)$ = reciprocal polynomial of $M^{(1)}(x)$.


$m = 4$: We know these factors of $x^{16} + x$: $x$, $x + 1$, $x^2 + x + 1$ (irreducible of
degree 1 and 2); $x^4 + x + 1$ and its reciprocal $x^4 + x^3 + 1$ (irreducible of degree
4). By division we find the remaining irreducible polynomial of degree 4:
$x^4 + x^3 + x^2 + x + 1$, and so

$$x^{16} + x = x(x + 1)(x^2 + x + 1)(x^4 + x + 1)(x^4 + x^3 + 1)(x^4 + x^3 + x^2 + x + 1).$$


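In GF($2^4$) defined by $\alpha^4 + \alpha + 1 = 0$ we have:

element                    minimal polynomial
0                          $x$
1                          $M^{(0)}(x) = x + 1$
$\alpha, \alpha^2, \alpha^4, \alpha^8$          $M^{(1)} = M^{(2)} = M^{(4)} = M^{(8)} = x^4 + x + 1$
$\alpha^3, \alpha^6, \alpha^9, \alpha^{12}$       $M^{(3)} = M^{(6)} = M^{(9)} = M^{(12)} = x^4 + x^3 + x^2 + x + 1$
$\alpha^5, \alpha^{10}$                 $M^{(5)} = M^{(10)} = x^2 + x + 1$
$\alpha^7, \alpha^{11}, \alpha^{13}, \alpha^{14}$    $M^{(7)} = M^{(11)} = M^{(13)} = M^{(14)} = M^{(-1)} = x^4 + x^3 + 1$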
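
The factorizations for $m = 1, \ldots, 4$ can be confirmed by brute force. In the sketch below (our own encoding and names, not the book's method) a binary polynomial is stored as an integer whose bit $i$ is the coefficient of $x^i$; irreducibility is tested by trial division, and the product of all irreducible polynomials whose degree divides $m$ is compared with $x^{2^m} + x$.

```python
def deg(f):
    """Degree of a binary polynomial stored as an integer (deg(0) = -1)."""
    return f.bit_length() - 1

def pmul(a, b):
    """Carry-less product of two binary polynomials."""
    c = 0
    while b:
        if b & 1:
            c ^= a
        a <<= 1
        b >>= 1
    return c

def pmod(a, b):
    """Remainder of a divided by b over GF(2)."""
    while deg(a) >= deg(b):
        a ^= b << (deg(a) - deg(b))
    return a

def is_irreducible(f):
    """Trial division by every polynomial of degree 1 .. deg(f)//2."""
    d = deg(f)
    if d < 1:
        return False
    return all(pmod(f, g) != 0 for g in range(2, 1 << (d // 2 + 1)))

def product_of_irreducibles(m):
    """Product of all irreducible binary polynomials whose degree divides m."""
    prod = 1
    for f in range(2, 1 << (m + 1)):
        if m % deg(f) == 0 and is_irreducible(f):
            prod = pmul(prod, f)
    return prod
```

For $m = 4$ the three irreducible quartics found are `0b10011`, `0b11001` and `0b11111`, matching the factorization of $x^{16} + x$ above. Trial division is only practical for small degrees.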


Theorem 4 implies that the polynomial $\pi(x)$ used to generate the field may
always be chosen to be a primitive polynomial. (Simply take $\pi(x)$ to be the
minimal polynomial of a primitive element.) As we saw in Fig. 3.1, this has the
advantage that an element of the field can be written either as a polynomial in
$\alpha$ of degree $\le m - 1$, or as a power of $\alpha$, where $\alpha$ is a zero of $\pi(x)$.


But it is a lot harder to find which of the irreducible polynomials are
primitive: see §3 of Ch. 8.


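Problem. (21) Show that

$$x^{32} + x = x(x + 1)(x^5 + x^2 + 1)(x^5 + x^3 + 1)(x^5 + x^4 + x^3 + x^2 + 1)$$
$$\cdot\, (x^5 + x^4 + x^2 + x + 1)(x^5 + x^3 + x^2 + x + 1)(x^5 + x^4 + x^3 + x + 1)$$

and that in GF($2^5$) defined by $\alpha^5 + \alpha^2 + 1 = 0$:

element                              minimal polynomial
0                                    $x$
1                                    $M^{(0)}(x) = x + 1$
$\alpha, \alpha^2, \alpha^4, \alpha^8, \alpha^{16}$             $M^{(1)}(x) = x^5 + x^2 + 1$
$\alpha^3, \alpha^6, \alpha^{12}, \alpha^{24}, \alpha^{17}$         $M^{(3)}(x) = x^5 + x^4 + x^3 + x^2 + 1$
$\alpha^5, \alpha^{10}, \alpha^{20}, \alpha^9, \alpha^{18}$         $M^{(5)}(x) = x^5 + x^4 + x^2 + x + 1$
$\alpha^7, \alpha^{14}, \alpha^{28}, \alpha^{25}, \alpha^{19}$        $M^{(7)}(x) = x^5 + x^3 + x^2 + x + 1$
$\alpha^{11}, \alpha^{22}, \alpha^{13}, \alpha^{26}, \alpha^{21}$       $M^{(11)}(x) = x^5 + x^4 + x^3 + x + 1$
$\alpha^{15}, \alpha^{30}, \alpha^{29}, \alpha^{27}, \alpha^{23}$       $M^{(15)}(x) = M^{(-1)}(x) = x^5 + x^3 + 1$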


§5. Tables of small fields


Figure 4.5 gives the fields GF(2), GF($2^2$), GF($2^3$), GF($2^5$), GF($2^6$), GF(3), and
GF($3^3$). The first column gives the element $\gamma$ as an $m$-tuple, while the second
gives the logarithm $i$, where $\gamma = \alpha^i$. (GF($2^4$) and GF($3^2$) will be found in Figs.
3.1 and 4.1.)
GF(2)              GF($2^2$)                      GF($2^3$)
                   $\pi(\alpha) = \alpha^2 + \alpha + 1 = 0$       $\pi(\alpha) = \alpha^3 + \alpha + 1 = 0$
element  log       element  log                element  log
0        -∞        00       -∞                 000      -∞
1         0        10        0                 100       0
                   01        1                 010       1
                   11        2                 001       2
                                               110       3
                                               011       4
                                               111       5
                                               101       6

GF($2^5$) with $\pi(\alpha) = \alpha^5 + \alpha^2 + 1 = 0$
element  log     element  log
00000    -∞      11111    15
10000     0      11011    16
01000     1      11001    17
00100     2      11000    18
00010     3      01100    19
00001     4      00110    20
10100     5      00011    21
01010     6      10101    22
00101     7      11110    23
10110     8      01111    24
01011     9      10011    25
10001    10      11101    26
11100    11      11010    27
01110    12      01101    28
00111    13      10010    29
10111    14      01001    30
GF($2^6$) with $\pi(\alpha) = \alpha^6 + \alpha + 1 = 0$
element  log     element  log     element  log
000000   -∞      110111   21      010111   42
100000    0      101011   22      111011   43
010000    1      100101   23      101101   44
001000    2      100010   24      100110   45
000100    3      010001   25      010011   46
000010    4      111000   26      111001   47
000001    5      011100   27      101100   48
110000    6      001110   28      010110   49
011000    7      000111   29      001011   50
001100    8      110011   30      110101   51
000110    9      101001   31      101010   52
000011   10      100100   32      010101   53
110001   11      010010   33      111010   54
101000   12      001001   34      011101   55
010100   13      110100   35      111110   56
001010   14      011010   36      011111   57
000101   15      001101   37      111111   58
110010   16      110110   38      101111   59
011001   17      011011   39      100111   60
111100   18      111101   40      100011   61
011110   19      101110   41      100001   62
001111   20

GF(3)              GF($3^3$)
                   $\pi(\alpha) = \alpha^3 + 2\alpha + 1 = 0$
element  log       element  log     element  log
0        -∞        000      -∞
1         0        100       0      200      13
2         1        010       1      020      14
                   001       2      002      15
                   210       3      120      16
                   021       4      012      17
                   212       5      121      18
                   111       6      222      19
                   221       7      112      20
                   202       8      101      21
                   110       9      220      22
                   011      10      022      23
                   211      11      122      24
                   201      12      102      25

Fig. 4.5. Some Galois fields.
Problem. (22) Of course GF($2^6$) contains GF($2^3$), from Theorem 8. Let $\xi$ be a
primitive element of GF($2^6$). Show that GF($2^3$) = $\{0, 1, \xi^9, \xi^{18}, \xi^{27}, \xi^{36}, \xi^{45}, \xi^{54}\}$.
Show that if $\xi^6 + \xi + 1 = 0$ then to get the first version of GF($2^3$) (shown in Fig.
4.2) we must take the primitive element of GF($2^3$) to be $\xi^{27}$, $\xi^{45}$ or $\xi^{54}$.


Figure 4.6 gives a short list of primitive polynomials, chosen so that the
number of terms in the polynomial is a minimum. For binary polynomials only
the exponents are given. E.g. the fifth line 5 2 0 means that $x^5 + x^2 + 1$ is
primitive.
Over GF(2)
exponents of terms
 1  0
 2  1  0
 3  1  0
 4  1  0
 5  2  0
 6  1  0
 7  1  0
 8  6  5  4  0
 9  4  0
10  3  0
11  2  0
12  7  4  3  0
13  4  3  1  0
14 12 11  1  0
15  1  0
16  5  3  2  0
17  3  0
18  7  0
19  6  5  1  0
20  3  0

Over GF(3)
$x + 1$
$x^2 + x + 2$
$x^3 + 2x + 1$
$x^4 + x + 2$
$x^5 + 2x + 1$
$x^6 + x + 2$

Over GF(5)
$x + 1$
$x^2 + x + 2$
$x^3 + 3x + 2$
$x^4 + x^2 + 2x + 2$

Over GF(7)
$x + 1$
$x^2 + x + 3$
$x^3 + 3x + 2$

Fig. 4.6. Some primitive polynomials over GF($p$).


§6. The automorphism group of GF($p^m$)


Associated with the field GF($p^m$) is the set of mappings, called
automorphisms, of the field onto itself which fix every element of the
subfield GF($p$), and preserve addition and multiplication. Such a mapping
will be denoted by

$$\sigma : \beta \to \beta^{\sigma}.$$

Definition. An automorphism of GF($p^m$) over GF($p$) is a mapping $\sigma$ which
fixes the elements of GF($p$) and has the properties

(i) $(\alpha + \beta)^{\sigma} = \alpha^{\sigma} + \beta^{\sigma}$,

(ii) $(\alpha\beta)^{\sigma} = \alpha^{\sigma}\beta^{\sigma}$.

The set of all automorphisms of GF($p^m$) forms a group if we define the
product of $\sigma$ and $\tau$ by

$$\alpha^{\sigma\tau} = (\alpha^{\sigma})^{\tau}.$$

This group is called the automorphism group or Galois group of GF($p^m$).


Example. The field GF(4) = $\{0, 1, \alpha, \alpha^2\}$, with $\alpha^2 = \alpha + 1$, $\alpha^3 = 1$. The
automorphism group of GF(4) consists of the identity mapping 1 and the mapping

$$\sigma : 1 \to 1, \quad \alpha \to \alpha^2, \quad \alpha^2 \to \alpha.$$

Clearly $\sigma^2 = 1$.


Theorem 12. The automorphism group of GF($p^m$) is the cyclic group of order
$m$ consisting of the mapping $\sigma_p : \beta \to \beta^p$ and its powers.


Proof. Using Lemma 5 it is clear that $\sigma_p$ and its powers are automorphisms of
GF($p^m$). Let $\alpha$ be a primitive element and $\sigma$ an automorphism of GF($p^m$).
From the definition of an automorphism, $\alpha$ and $\alpha^{\sigma}$ have the same minimal
polynomial. By Problem 14, $\alpha^{\sigma}$ is one of $\alpha, \alpha^p, \alpha^{p^2}, \ldots, \alpha^{p^{m-1}}$. But if
$\alpha^{\sigma} = \alpha^{p^i}$, then $\sigma = (\sigma_p)^i$. Q.E.D.
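Theorem 12 can be checked by brute force in a small field. The sketch below is only an illustration under stated assumptions: it takes GF($2^4$) built from $x^4 + x + 1$, stores an element as an integer whose bit $i$ is the coefficient of $\alpha^i$, and uses the hypothetical helper names `gf_mul` and `frobenius`.

```python
def gf_mul(a, b, m, poly):
    """Multiply a and b in GF(2^m); poly is the defining polynomial, including x^m."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:        # degree reached m: reduce modulo poly
            a ^= poly
    return result


M, POLY = 4, 0b10011            # assumed field: GF(2^4) with x^4 + x + 1


def frobenius(x, times=1):
    """Apply sigma_2 : beta -> beta^2 the given number of times."""
    for _ in range(times):
        x = gf_mul(x, x, M, POLY)
    return x


field = range(2 ** M)

# sigma_2 preserves addition (XOR) and multiplication.
is_automorphism = all(
    frobenius(a ^ b) == frobenius(a) ^ frobenius(b)
    and frobenius(gf_mul(a, b, M, POLY)) == gf_mul(frobenius(a), frobenius(b), M, POLY)
    for a in field
    for b in field
)

# Smallest k >= 1 with sigma_2^k equal to the identity.
order = next(k for k in range(1, M + 1) if all(frobenius(x, k) == x for x in field))
print(is_automorphism, order)
```

The order found is $m = 4$, in line with the theorem; the check says nothing about fields other than the one assumed here.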


This theorem shows that in a finite field of characteristic $p$ every element
has a unique $p$-th root. We shall occasionally use the fact that every element of
GF($2^m$) has a unique square root, given by $\sqrt{\beta} = \beta^{2^{m-1}}$.

If $p \ne 2$ exactly half the nonzero field elements have square roots. These
elements are the quadratic residues, mentioned in Ch. 2. If $\alpha$ is a primitive
field element the quadratic residues are $\alpha^{2i}$, the even powers of $\alpha$. It is now
obvious that they form a group and that property (Q1) of Ch. 2 holds, namely:

residue $\cdot$ residue = residue,
nonresidue $\cdot$ nonresidue = residue,
residue $\cdot$ nonresidue = nonresidue.


We can also prove property (Q2), namely:


Theorem 13. If $p^m = 4k + 1$, then $-1$ is a quadratic residue; if $p^m = 4k - 1$, $-1$
is a nonresidue.

Proof. $\alpha^{(p^m - 1)/2} = -1$. If $p^m = 4k + 1$ then $\frac{1}{2}(p^m - 1) = 2k$, and $-1$ is an even
power of $\alpha$; while if $p^m = 4k - 1$ then $\frac{1}{2}(p^m - 1) = 2k - 1$, and $-1$ is an odd
power of $\alpha$. Q.E.D.


Problems. (23) Prove Dedekind's theorem: if $\sigma_1, \ldots, \sigma_n$ are distinct
automorphisms of a field $F$, then it is impossible to find elements $a_1, \ldots, a_n$, not all 0,
in $F$, such that $a_1\beta^{\sigma_1} + \cdots + a_n\beta^{\sigma_n} = 0$ for all $\beta \in F$.

(24) Regard $\sigma_p$ as a linear transformation and let $T$ be the matrix of this
transformation. Show that $f(x) = x^m - 1$ is the least degree polynomial such
that $f(T) = 0$.


§7. The number of irreducible polynomials


Let $I_q(m)$ be the number of monic polynomials of degree $m$ which are
irreducible over GF($q$), where $q$ is any prime power. This number can be
expressed in terms of the Möbius function, which is defined by:

$$\mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ (-1)^r & \text{if } n \text{ is the product of } r \text{ distinct primes,} \\ 0 & \text{otherwise.} \end{cases}$$

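The fundamental properties of the Möbius function are given by


Theorem 14.

(i) $$\sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases}$$

(ii) The Möbius inversion formula:

if $f(n) = \sum_{d \mid n} g(d)$ for all positive integers $n$,

then $g(n) = \sum_{d \mid n} \mu(d) f(n/d)$.

(iii) $\varphi(n) = \sum_{d \mid n} d\,\mu(n/d)$ for $n \ge 1$.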


Problem. (25) Prove the above theorem. 
Then we have:


Theorem 15.

$$I_q(m) = \frac{1}{m} \sum_{d \mid m} \mu(d)\, q^{m/d}.$$

Proof. From Theorem 11,

$$q^m = \sum_{d \mid m} d\, I_q(d).$$

The result then follows by Möbius inversion. Q.E.D.


E.g. when $q = 2$, Theorem 15 gives $I_2(1) = 2$, $I_2(2) = 1$, $I_2(3) = 2$, $I_2(4) = 3$, in
agreement with the calculations above.
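The count in Theorem 15 is easy to evaluate directly. The following is a small sketch; the function names `mobius` and `num_irreducible` are chosen for this illustration and do not come from the text.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original n
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def num_irreducible(q, m):
    """I_q(m) = (1/m) * sum over d | m of mu(d) * q^(m/d)  (Theorem 15)."""
    total = sum(mobius(d) * q ** (m // d) for d in range(1, m + 1) if m % d == 0)
    return total // m


print([num_irreducible(2, m) for m in range(1, 7)])
```

For $q = 2$ the first four values printed are 2, 1, 2, 3, matching the values quoted in the text.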


Corollary 16. $I_q(m) \ge 1$ for all $q$, $m$.


Proof. To get a lower bound from Theorem 15, replace $\mu(d)$ by $-1$ for $d > 1$.
Then $I_q(m) \ge (q^m - q^{m-1} - \cdots - q)/m > 0$ by a simple calculation. Q.E.D.


Thus there is a polynomial of degree m, irreducible over GF(p), for all p 
and m, giving the second proof of Theorem 7. 


§8. Bases of GF($p^m$) over GF($p$)


GF($p^m$) is a vector space of dimension $m$ over GF($p$). Any set of $m$
linearly independent elements can be used as a basis for this vector space.

In constructing GF($p^m$) from a primitive irreducible polynomial $\pi(x)$, we
used the basis $1, \alpha, \alpha^2, \ldots, \alpha^{m-1}$, where $\alpha$ is a zero of $\pi(x)$. However there
are other possibilities.


Trace


Definition. The sum

$$T_m(\beta) = \beta + \beta^p + \beta^{p^2} + \cdots + \beta^{p^{m-1}} = \sum_{j=0}^{m-1} \beta^{p^j}$$

is called the trace of $\beta \in$ GF($p^m$). Since

$$T_m(\beta)^p = \sum_{j=1}^{m} \beta^{p^j} = T_m(\beta),$$

from Theorem 8 $T_m(\beta)$ is one of $0, 1, 2, \ldots, p - 1$ (i.e., an element of GF($p$)).
Thus the trace of an element of GF($2^m$) is 0 or 1.
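The statement that the trace lands in GF(2) can be confirmed numerically. This is a minimal sketch under assumptions: the field is GF($2^4$) built from $x^4 + x + 1$, elements are integers whose bit $i$ is the coefficient of $\alpha^i$, and `gf_mul` and `trace` are names introduced here.

```python
def gf_mul(a, b, m, poly):
    """Multiply a and b in GF(2^m); poly is the defining polynomial, including x^m."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return result


def trace(beta, m, poly):
    """T_m(beta) = beta + beta^2 + beta^4 + ... + beta^(2^(m-1))."""
    total, x = 0, beta
    for _ in range(m):
        total ^= x                      # addition in characteristic 2 is XOR
        x = gf_mul(x, x, m, poly)       # next conjugate
    return total


M, POLY = 4, 0b10011                    # assumed field: GF(2^4) with x^4 + x + 1
traces = [trace(b, M, POLY) for b in range(2 ** M)]
print(sorted(set(traces)), traces.count(1))
```

Every trace is 0 or 1, and each value occurs $2^{m-1} = 8$ times, which is the behaviour asserted in part (ii) of Problem 26.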


Problem. (26) Show that the trace has the following properties.

(i) $T_m(\beta + \gamma) = T_m(\beta) + T_m(\gamma)$, $\beta, \gamma \in$ GF($p^m$).

(ii) $T_m(\beta)$ takes on each value in GF($p$) equally often, i.e., $p^{m-1}$ times.
[Hint: from Problem 23, $T_m$ is not identically zero.]

(iii) $T_m(\beta^p) = T_m(\beta)^p = T_m(\beta)$.

(iv) $T_m(1) \equiv m \pmod{p}$.

(v) Let $T_m(x)$ be the polynomial

$$T_m(x) = \sum_{j=0}^{m-1} x^{p^j}.$$

For $s \in$ GF($p$), show [displayed identity missing].

(vi) If $M(z) = z^r + M_{r-1}z^{r-1} + \cdots$ is the minimal polynomial of $\beta \in$ GF($p^m$),
show that $T_m(\beta) = -mM_{r-1}/r$.


Lemma 17. (The Vandermonde matrix.) The matrix

$$\begin{pmatrix} 1 & a_1 & a_1^2 & \cdots & a_1^{n-1} \\ 1 & a_2 & a_2^2 & \cdots & a_2^{n-1} \\ \vdots & & & & \vdots \\ 1 & a_n & a_n^2 & \cdots & a_n^{n-1} \end{pmatrix}$$

(where the $a_i$'s are from any finite or infinite field), is called a Vandermonde
matrix. Its determinant is

$$\prod_{j=1}^{n-1} \prod_{i=j+1}^{n} (a_i - a_j),$$

which is nonzero if the $a_i$'s are distinct.

Proof. It is easy to show that the determinant is equal to

$$\det \begin{pmatrix} (a_2 - a_1) & (a_2 - a_1)a_2 & (a_2 - a_1)a_2^2 & \cdots & (a_2 - a_1)a_2^{n-2} \\ (a_3 - a_1) & (a_3 - a_1)a_3 & (a_3 - a_1)a_3^2 & \cdots & (a_3 - a_1)a_3^{n-2} \\ \vdots & & & & \vdots \\ (a_n - a_1) & (a_n - a_1)a_n & (a_n - a_1)a_n^2 & \cdots & (a_n - a_1)a_n^{n-2} \end{pmatrix}.$$

The result then follows by induction. Q.E.D.
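The determinant formula of Lemma 17 can be compared against plain Gaussian elimination. The sketch below works over the rationals (one of the infinite fields the lemma allows); `det`, `vandermonde` and `product_formula` are names made up for this example.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, d = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:                  # a row swap changes the sign
            a[c], a[pivot] = a[pivot], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return d


def vandermonde(points):
    """Rows (1, a_i, a_i^2, ..., a_i^(n-1))."""
    n = len(points)
    return [[p ** k for k in range(n)] for p in points]


def product_formula(points):
    """Product of (a_i - a_j) over all pairs with i > j."""
    return prod(points[i] - points[j] for j, i in combinations(range(len(points)), 2))


pts = [2, 3, 5, 7]
print(det(vandermonde(pts)), product_formula(pts))
```

Repeating a point makes two rows equal, so both sides vanish, which is the converse of the nonvanishing claim in the lemma.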


*The complementary basis. In the remainder of this section we restrict
ourselves to fields of characteristic $p = 2$.

Two bases $\beta_1, \beta_2, \ldots, \beta_m$ and $\lambda_1, \lambda_2, \ldots, \lambda_m$ are said to be complementary
if

$$T_m(\beta_i\lambda_j) = 0 \quad \text{for } i \ne j, \qquad T_m(\beta_i\lambda_i) = 1.$$

These are also sometimes called dual bases.
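Lemma 18. Let $\beta_1, \beta_2, \ldots, \beta_m$ be a basis. The matrix

$$B = \begin{pmatrix} \beta_1 & \beta_1^2 & \beta_1^{2^2} & \cdots & \beta_1^{2^{m-1}} \\ \beta_2 & \beta_2^2 & \beta_2^{2^2} & \cdots & \beta_2^{2^{m-1}} \\ \vdots & & & & \vdots \\ \beta_m & \beta_m^2 & \beta_m^{2^2} & \cdots & \beta_m^{2^{m-1}} \end{pmatrix} \tag{12}$$

is invertible, and $\det B = 1$.


Proof. Expressing the basis $1, \alpha, \ldots, \alpha^{m-1}$ in terms of the basis $\beta_1, \beta_2, \ldots, \beta_m$
we have

$$\begin{pmatrix} 1 \\ \alpha \\ \alpha^2 \\ \vdots \\ \alpha^{m-1} \end{pmatrix} = C \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_m \end{pmatrix}$$

for some binary matrix $C$. Then

$$CB = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ \alpha & \alpha^2 & \cdots & \alpha^{2^{m-1}} \\ \vdots & & & \vdots \\ \alpha^{m-1} & (\alpha^{m-1})^2 & \cdots & (\alpha^{m-1})^{2^{m-1}} \end{pmatrix}.$$

By Lemma 17 this matrix is invertible, hence $B$ is invertible. Now

$$BB^T = \begin{pmatrix} T_m(\beta_1) & T_m(\beta_1\beta_2) & \cdots & T_m(\beta_1\beta_m) \\ T_m(\beta_1\beta_2) & T_m(\beta_2) & \cdots & T_m(\beta_2\beta_m) \\ \vdots & & & \vdots \\ T_m(\beta_1\beta_m) & T_m(\beta_2\beta_m) & \cdots & T_m(\beta_m) \end{pmatrix}$$

is a binary matrix, so has determinant 0 or 1. Thus $\det BB^T = (\det B)^2 = 1$,
which implies $\det B = 1$. Q.E.D.

*Starred sections can be omitted on first reading.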


Problem. (27) If $\beta_1, \beta_2, \ldots, \beta_m$ is a basis then $T_m(\beta_i) = 1$ for at least one $\beta_i$.


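Theorem 19. Every basis has a complementary basis.


Proof. Recall that the inverse of a matrix $[a_{ij}]$ is obtained by replacing $a_{ij}$ by
the cofactor of $a_{ji}$, divided by $\det [a_{ij}]$. It is readily seen that $B^{-1}$ is of the form

$$\begin{pmatrix} \lambda_1 & \lambda_2 & \cdots & \lambda_m \\ \lambda_1^2 & \lambda_2^2 & \cdots & \lambda_m^2 \\ \vdots & & & \vdots \\ \lambda_1^{2^{m-1}} & \lambda_2^{2^{m-1}} & \cdots & \lambda_m^{2^{m-1}} \end{pmatrix}$$

where $\lambda_1, \lambda_2, \ldots, \lambda_m$ are linearly independent over GF(2).
Then the equation $BB^{-1} = I$ shows that $\beta_1, \beta_2, \ldots, \beta_m$ and $\lambda_1, \lambda_2, \ldots, \lambda_m$
are complementary bases. Q.E.D.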
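For a small field the complementary basis can simply be searched for, using the defining trace conditions rather than the matrix inverse of the proof. This is an illustrative sketch under assumptions: GF($2^3$) with $\alpha^3 + \alpha + 1 = 0$ as in Fig. 4.5, elements stored as integers with bit $i$ the coefficient of $\alpha^i$, and the hypothetical helper `complementary_basis`.

```python
def gf_mul(a, b, m, poly):
    """Multiply a and b in GF(2^m); poly is the defining polynomial, including x^m."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return result


def trace(beta, m, poly):
    """T_m(beta) = beta + beta^2 + ... + beta^(2^(m-1))."""
    total, x = 0, beta
    for _ in range(m):
        total ^= x
        x = gf_mul(x, x, m, poly)
    return total


def complementary_basis(basis, m, poly):
    """Brute-force search for lambda_j with T(beta_i * lambda_j) = 1 iff i == j."""
    dual = []
    for j in range(len(basis)):
        matches = [
            lam for lam in range(2 ** m)
            if all(
                trace(gf_mul(b, lam, m, poly), m, poly) == (1 if i == j else 0)
                for i, b in enumerate(basis)
            )
        ]
        dual.append(matches[0])         # unique when `basis` really is a basis
    return dual


M, POLY = 3, 0b1011                     # GF(2^3), alpha^3 + alpha + 1 = 0
print(complementary_basis([1, 2, 4], M, POLY))   # basis 1, alpha, alpha^2
```

With this representation the basis $1, \alpha, \alpha^2$ has complementary basis $1, \alpha^2, \alpha$. The search is exponential in $m$ and is meant only as a check on small cases.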


Problem. (28) If $\gamma \in$ GF($2^m$) then

$$\gamma = \sum_{i=1}^{m} T_m(\gamma\lambda_i)\,\beta_i.$$


*§9. Linearized polynomials and normal bases


Let $q = p^a$ be a power of the prime $p$.


Definition. A linearized polynomial over GF($q^s$) is a polynomial of the form

$$L(z) = \sum_{i=0}^{h} l_i z^{q^i},$$

where $l_i \in$ GF($q^s$), $l_h = 1$. E.g. $z + z^2 + z^4$ is a linearized polynomial over
GF($2^s$) for any $s$.
Let the zeros of $L(z)$ lie in the extension field GF($q^m$), $m \ge s$.


Lemma 20. The zeros of $L(z)$ form a subspace of GF($q^m$), where GF($q^m$) is
regarded as a vector space over GF($q$).


Proof. If $\beta_1$, $\beta_2$ are zeros of $L(z)$ then so is $\lambda_1\beta_1 + \lambda_2\beta_2$, for $\lambda_i \in$ GF($q$), by
Corollary 3 and Lemma 5. Q.E.D.


Problem. (29) If $r$ divides $m$, show that the zeros of $L(z)$ which lie in GF($q^r$)
form a subspace of GF($q^m$).


Since $L'(z) = l_0$, if $l_0 \ne 0$ the $q^h$ zeros of $L(z)$ are distinct and form an
$h$-dimensional subspace of GF($q^m$).


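Lemma 21. Conversely, let $U$ be an $h$-dimensional subspace of GF($q^m$). Then

$$L(z) = \prod_{\beta \in U} (z - \beta)$$

is a linearized polynomial over GF($q^m$), i.e.

$$L(z) = z^{q^h} + l_{h-1}z^{q^{h-1}} + \cdots + l_0 z. \tag{13}$$


Proof. Let $\beta_0, \ldots, \beta_{h-1} \in$ GF($q^m$) be a basis for $U$ over GF($q$). By Lemma 18
the matrix

$$\begin{pmatrix} \beta_0 & \beta_0^q & \cdots & \beta_0^{q^{h-1}} \\ \vdots & & & \vdots \\ \beta_{h-1} & \beta_{h-1}^q & \cdots & \beta_{h-1}^{q^{h-1}} \end{pmatrix}$$

is invertible. Thus there is a solution $l_0, \ldots, l_{h-1}$ in GF($q^m$) to the equations

$$\beta_i^{q^h} + \sum_{j=0}^{h-1} l_j \beta_i^{q^j} = 0, \qquad i = 0, \ldots, h - 1.$$

Therefore $\beta_0, \ldots, \beta_{h-1}$ are zeros of the polynomial

$$L(z) = z^{q^h} + \sum_{j=0}^{h-1} l_j z^{q^j}.$$

Any linear combination of the $\beta_i$, i.e. any element of $U$, is also a zero of $L(z)$.
Hence

$$L(z) = \prod_{\beta \in U} (z - \beta). \qquad \text{Q.E.D.}$$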
Example. Let $U$ be the subspace of GF($2^3$) consisting of the points 000, 100,
010, 110. Then (from Fig. 4.5)

$$\prod_{\beta \in U} (z - \beta) = z(z - 1)(z - \alpha)(z - \alpha^3) = z^4 + \alpha^5 z^2 + \alpha^4 z$$

is a linearized polynomial over GF($2^3$).
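The product in this example can be multiplied out mechanically. The sketch below assumes GF($2^3$) with $\alpha^3 + \alpha + 1 = 0$ and stores an element as an integer whose bit $i$ is the coefficient of $\alpha^i$, so the points 000, 100, 010, 110 become 0, 1, 2, 3; `poly_from_roots` is a name introduced for the illustration.

```python
def gf_mul(a, b, m, poly):
    """Multiply a and b in GF(2^m); poly is the defining polynomial, including x^m."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return result


def poly_from_roots(roots, m, poly):
    """Coefficients (lowest degree first) of the product of (z - r) over the roots."""
    coeffs = [1]
    for r in roots:
        # In characteristic 2, z - r = z + r.
        shifted = [0] + coeffs                                   # z * current
        scaled = [gf_mul(c, r, m, poly) for c in coeffs] + [0]   # r * current
        coeffs = [s ^ t for s, t in zip(shifted, scaled)]
    return coeffs


M, POLY = 3, 0b1011
U = [0, 1, 2, 3]                        # 0, 1, alpha, alpha^3 = 1 + alpha
print(poly_from_roots(U, M, POLY))
```

Only the coefficients of $z$, $z^2$ and $z^4$ are nonzero; they are 6, 7 and 1, i.e. $\alpha^4$, $\alpha^5$ and 1 in this representation, which agrees with $z^4 + \alpha^5 z^2 + \alpha^4 z$.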


Normal bases.


Definition. A normal basis of GF($p^m$) over GF($p$) is a basis of the form
$\gamma, \gamma^p, \gamma^{p^2}, \ldots, \gamma^{p^{m-1}}$.

For example GF($2^2$) clearly has the normal basis $\alpha, \alpha^2$ over GF(2). We shall
occasionally use a normal basis in future chapters, hence this section is
devoted to proving that there always is such a basis.

A linearized polynomial

$$L(z) = \sum_{i=0}^{h} l_i z^{p^i}$$

will be called a $p$-polynomial if the coefficients $l_i$ are restricted to GF($p$).

By Lemmas 20 and 21 the zeros of a $p$-polynomial form a subspace $M$ of
GF($p^m$) over GF($p$) for some choice of $m$. They have the additional property
that if $\mu$ belongs to $M$ so does $\mu^p$ (since $L(\mu)^p = L(\mu^p)$). A subspace with
this property will be called a modulus. If $l_0 \ne 0$, $L(z)$ has only simple zeros,
and we suppose this is always the case.

On the other hand, if $M$ is a modulus, then by Lemma 21 $\prod_{\beta \in M} (z - \beta)$ is a
linearized polynomial $L(z)$. Now $M$ consists of a union of sets of the form
$\{\beta, \beta^p, \beta^{p^2}, \ldots\}$, and so by Problem 14 the coefficients of $L(z)$ are in GF($p$).
Thus we have:


Lemma 22. Suppose $L(z)$ is a linearized polynomial over GF($p^m$). The zeros of
$L(z)$ form a modulus iff $L(z)$ is a $p$-polynomial.


Example. From the table of GF($2^4$) in Fig. 3.1 it is easy to see that the
elements $0, 1, \alpha, \alpha^2, \alpha^4, \alpha^5, \alpha^8, \alpha^{10}$ of GF($2^4$) form a modulus. The
corresponding $p$-polynomial is $z^8 + z^4 + z^2 + z$.

The ordinary product of two $p$-polynomials is not a $p$-polynomial. We
define the symbolic product of two $p$-polynomials $F(z)$ and $G(z)$ to be

$$F(z) \otimes G(z) = F(G(z)).$$


Problems. (30) Show that this multiplication is commutative, i.e. $F(G(z)) = G(F(z))$.

To the $p$-polynomial

$$F(z) = \sum l_i z^{p^i}$$

we associate the ordinary polynomial

$$f(z) = \sum l_i z^i.$$

(31) Show that the ordinary polynomial associated with $F(z) \otimes G(z)$ is
$f(z) \cdot g(z)$.


Lemma 23. If a $p$-polynomial $H(z)$ is a symbolic product $F(z) \otimes G(z)$ of
$p$-polynomials then $H(z)$ is divisible in the ordinary sense by $F(z)$ and $G(z)$.
Conversely, if $H(z)$ is divisible by $G(z)$, then $H(z) = F(z) \otimes G(z)$ for some
$F(z)$.


Proof. First let $f(z) = (1/z)F(z)$. Then $H(z) = f(G(z)) \cdot G(z)$, so is divisible by
$G(z)$, and similarly by $F(z)$. Conversely, suppose $H(z) = A(z) \cdot G(z)$. Then
dividing $H(z)$ symbolically by $G(z)$ we obtain

$$H(z) = F(z) \otimes G(z) + R(z) = f(G(z)) \cdot G(z) + R(z)$$

where $\deg R(z) < \deg G(z)$. But since $H(z) = A(z) \cdot G(z)$ we must have
$R(z) = 0$. Q.E.D.


Example. Consider the $p$-polynomial of the last example. We have

$$z^8 + z^4 + z^2 + z = (z^2 + z) \otimes (z^4 + z) = (z^4 + z + 1)(z^4 + z) = (z^6 + z^5 + z^4 + z^3 + 1)(z^2 + z).$$

If $L(z)$ cannot be written in the form $F(z) \otimes G(z)$ it is said to be
symbolically irreducible. A zero $\mu$ of $L(z)$ is called a primitive zero if it is not
a zero of a $p$-polynomial of lower degree.


Theorem 24. A $p$-polynomial $L(z)$ for which $l_0 \ne 0$ always has a primitive zero.


Proof. Let

$$L(z) = F_1(z)^{e_1} \otimes F_2(z)^{e_2} \otimes \cdots \otimes F_r(z)^{e_r} \tag{14}$$

be the decomposition of $L(z)$ into symbolically irreducible factors, where the
$F_i(z)$ are distinct and $F_i(z)$ has degree $p^{m_i}$. Then $L(z)$ has degree $p^m$, where
$m = \sum e_i m_i$. Since $l_0 \ne 0$, $L(z)$ has only simple zeros.

If $\mu$ is an element of a field GF($p^n$) there is a unique $p$-polynomial of smallest
degree, $U(z)$ say, which has $\mu$ as a zero; i.e. $U(z)$ is the polynomial with
zeros $\mu, \mu^p, \mu^{p^2}, \ldots$ and all linear combinations of these. Furthermore if $\mu$ is
a zero of $L(z)$ then $L(z)$ is divisible by $U(z)$. Hence a zero of $L(z)$ is a
primitive zero if it is not a zero of any of the symbolic divisors of $L(z)$. Using
the principle of inclusion and exclusion the number of such zeros is

$$p^m - \sum_{i} p^{m - m_i} + \sum_{i < j} p^{m - m_i - m_j} - \cdots = p^m \left(1 - \frac{1}{p^{m_1}}\right)\left(1 - \frac{1}{p^{m_2}}\right) \cdots \left(1 - \frac{1}{p^{m_r}}\right) > 0. \tag{15}$$

Thus $L(z)$ has a primitive zero. Q.E.D.





Example. The $p$-polynomial

$$x^8 + x^4 + x^2 + x = (x^2 + x) \otimes (x^2 + x) \otimes (x^2 + x),$$

and $x^2 + x$ is symbolically irreducible. Therefore the number of primitive
zeros is $2^3(1 - \frac{1}{2}) = 4$. Indeed, the primitive zeros are $\alpha, \alpha^2, \alpha^4, \alpha^8$, where
$\alpha^4 + \alpha + 1 = 0$.


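Now let

$$L(z) = z^{p^m} + l_{m-1}z^{p^{m-1}} + \cdots + l_1 z^p + l_0 z, \qquad l_i \in \mathrm{GF}(p), \ l_0 \ne 0,$$

be any $p$-polynomial. By Theorem 24, $L(z)$ has a primitive zero $\mu$. $L(z)$ is the
unique $p$-polynomial of lowest degree having $\mu$ as a zero. The zeros of $L(z)$
are the $p^m$ elements

$$\epsilon_0\mu + \epsilon_1\mu^p + \epsilon_2\mu^{p^2} + \cdots + \epsilon_{m-1}\mu^{p^{m-1}},$$

where $\epsilon_i \in$ GF($p$), and they form a modulus $M$ by Lemma 22. Clearly this
modulus has the normal basis

$$\mu, \mu^p, \mu^{p^2}, \ldots, \mu^{p^{m-1}}.$$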


We can now prove the main theorem of this section. 


Theorem 25. (The normal basis theorem.) A normal basis exists in any field
GF($p^m$).


Proof. Choose $L(z) = z^{p^m} - z$. We know that the zeros of $L(z)$ are all the
elements of GF($p^m$). The preceding remarks show that GF($p^m$) has a normal
basis over GF($p$). Q.E.D.


Problem. (32) For characteristic 2, show that the complementary basis of a
normal basis is also a normal basis.

Now suppose $p$ is 2 and $m$ is odd, so that $x^m + 1$ has no multiple zeros. Let

$$x^m + 1 = \prod_{i=1}^{r} f_i(x), \qquad \deg f_i = m_i,$$

be the factorization of $x^m + 1$ into distinct irreducible factors. By Problem 31,

$$z^{2^m} + z = F_1(z) \otimes \cdots \otimes F_r(z).$$

From (15), the number of primitive zeros of this polynomial, and hence the
number of elements $\mu$ which generate normal bases of GF($2^m$), is

$$N(2, m) = 2^m \prod_{i=1}^{r} \left(1 - \frac{1}{2^{m_i}}\right) = \prod_{i=1}^{r} (2^{m_i} - 1),$$

which is odd. The number of normal bases is $N(2, m)/m$, which is also odd.
By Problem 32 complementation maps normal bases to normal bases, so with an
odd number of them at least one must be mapped to itself. Thus we have proved:


Theorem 26. If $m$ is odd, GF($2^m$) has a self-complementary normal basis.


Example. We have

$$x^3 + 1 = (x + 1)(x^2 + x + 1),$$
$$z^8 + z = (z^2 + z) \otimes (z^4 + z^2 + z).$$

Hence the number of elements which generate normal bases for GF($2^3$) over
GF(2) is $2^2 - 1 = 3$, and there is one such normal basis, i.e. $\alpha^3, \alpha^6, \alpha^5$ using the
table of Fig. 4.5. This basis is self-complementary.
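The count $N(2, m)$ can be confirmed by exhaustive search over a small field. The following is a sketch only: it assumes particular defining polynomials ($x^3 + x + 1$, $x^4 + x + 1$, $x^5 + x^2 + 1$), stores elements as integers with bit $i$ the coefficient of $\alpha^i$, and the names `rank_gf2` and `normal_basis_generators` are invented for the example.

```python
def gf_mul(a, b, m, poly):
    """Multiply a and b in GF(2^m); poly is the defining polynomial, including x^m."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return result


def rank_gf2(vectors):
    """Rank over GF(2) of integers regarded as bit vectors."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)           # clear b's leading bit from v if present
        if v:
            basis.append(v)
    return len(basis)


def normal_basis_generators(m, poly):
    """All gamma whose conjugates gamma, gamma^2, ..., gamma^(2^(m-1)) are independent."""
    generators = []
    for g in range(1, 2 ** m):
        conjugates, x = [], g
        for _ in range(m):
            conjugates.append(x)
            x = gf_mul(x, x, m, poly)
        if rank_gf2(conjugates) == m:
            generators.append(g)
    return generators


print(normal_basis_generators(3, 0b1011))
```

For GF($2^3$) the three generators found are 3, 5 and 7, that is $\alpha^3$, $\alpha^6$ and $\alpha^5$ in the notation of Fig. 4.5. For $m = 5$ the search gives 15 generators, matching $(2 - 1)(2^4 - 1)$.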
We state without proof the following refinement of the normal basis 
theorem. 


Theorem 27. (Davenport.) Any finite field GF($p^m$) contains a primitive element
$\gamma$ such that $\gamma, \gamma^p, \ldots, \gamma^{p^{m-1}}$ is a normal basis.


Corollary 28. GF($2^m$) contains a primitive element of trace 1.

Research Problem. (4.1) Find a simple direct proof of Corollary 28.


Problem. (33) Show that the only symbolically irreducible $p$-polynomials over
GF(2) of degree $< 16$ are: $z$, $z^2$, $z^2 + z$, $z^4 + z^2 + z$, $z^8 + z^2 + z$ and $z^8 + z^4 + z$.
Notes on Chapter 4


Good references on finite fields are Albert [19], Artin [30], Berlekamp [113,
Ch. 4], Birkhoff and Bartee [153], Conway [301], Jacobson [687, Vol. 3],
Peterson and Weldon [1040, Ch. 6], Van der Waerden [1376, Vol. 1], and
Zariski and Samuel [1454, Vol. 1]. Further properties will be found in
Berlekamp et al. [131], Carlitz [242-246], Cazacu and Simovici [254],
Davenport [329], Daykin [340], Roth [1126], Zierler [1466, 1467] and Zierler and
Brillhart [1469].


§2. The epigraph is from Weinstein [1395]. We are taking for granted two
theorems. Let $f(x)$ be a polynomial of degree $n$ with coefficients in a field $F$.
(1) If $f(\beta) = 0$ for $\beta \in F$ then $f(x) = (x - \beta)g(x)$ for some $g(x)$ with
coefficients in $F$. (See for example p. 40 of [19] or p. 31, Vol. 1 of [1454].) (2) Hence
$f(x)$ has at most $n$ zeros in $F$. For Problem 8 see Hardy and Wright [602].


§4. Berlekamp [113, p. 112] describes more sophisticated methods for finding
minimal polynomials.


§5. For more extensive tables of small fields, see Alanen and Knuth [17],
Bussey [222, 223] and Conway [301]. Extensive tables of irreducible
polynomials can be found in Alanen and Knuth [17], Green and Taylor [554],
Marsh [917], Mossige [973], Peterson and Weldon [1040] and Stahnke [1260].
See also Beard and West [93] and Golomb [524a]. Factoring polynomials over
finite fields is discussed in Ch. 9.


§6. For Problem 23 see e.g. [30, p. 25] or [687, Vol. 3, p. 25].


§7. For the Möbius function see Hardy and Wright [602]. Knuth [772, p. 36]
gives the inverse of a Vandermonde matrix. See also Althaus and Leake [27].


§9. The usual proof of the normal basis theorem (see for example Jacobson
[687, Vol. 3, p. 61]) requires much more background. The proof given here is
due to Ore [1015]. For another proof see Berger and Reiner [107]. Linearized
polynomials have been studied by Ore [1015, 1016], Pele [1034], and
Berlekamp [113, Ch. 11]. Theorem 27 is from Davenport [330] (see also Carlitz [243]).
Lempel [814] shows that every field GF($2^m$) has a self-complementary basis. See
also Mann [910].


Dual codes and their weight 
distribution 


§1. Introduction


The main result of the chapter, Theorem 1, is the surprising fact that the
weight enumerator of the dual $\mathscr{C}^\perp$ of a binary linear code $\mathscr{C}$ is uniquely
determined by the weight enumerator of $\mathscr{C}$ itself. In fact it is given by a linear
transformation of the weight enumerator of $\mathscr{C}$. This theorem is proved in §2.
If the same transformation is applied to the distance distribution of a
nonlinear code, we obtain a set of nonnegative numbers with useful properties
(§5). In order to study nonlinear codes we need to develop some algebraic
machinery, namely Krawtchouk polynomials (end of §2), the group algebra
(§3) and characters (§4). In §6 we return to linear codes and consider several
different types of weight enumerators of nonbinary codes. For each of these
there is a result analogous to Theorem 1, namely that the enumerator of the
dual code $\mathscr{C}^\perp$ is given by a linear transformation of the enumerator of $\mathscr{C}$.
Theorem 14 is a very general result of this type. However, the results for the
complete weight enumerator (Theorem 10) and the Hamming weight
enumerator (Theorem 13) are the most useful. The last section (§7) gives further
properties of Krawtchouk polynomials.


§2. Weight distribution of the dual of a binary linear code


Recall (from §8 of Ch. 1) that if $\mathscr{C}$ is a linear code over a finite field, the
dual code $\mathscr{C}^\perp$ consists of all vectors $u$ having inner product zero with every
codeword of $\mathscr{C}$. If $\mathscr{C}$ is an $[n, k]$ code, $\mathscr{C}^\perp$ is an $[n, n - k]$ code.


Weight enumerators. As in Ch. 2, $A_i$ will denote the number of codewords of
weight $i$ in $\mathscr{C}$. We will call the polynomial

$$\sum_{i=0}^{n} A_i x^{n-i} y^i$$

the weight enumerator of $\mathscr{C}$, and denote it by $W_{\mathscr{C}}(x, y)$.
Observe that there are two ways of writing $W_{\mathscr{C}}(x, y)$:

$$W_{\mathscr{C}}(x, y) = \sum_{i=0}^{n} A_i x^{n-i} y^i = \sum_{u \in \mathscr{C}} x^{n - \mathrm{wt}(u)} y^{\mathrm{wt}(u)}. \tag{1}$$

Here $x$ and $y$ are indeterminates, and $W_{\mathscr{C}}(x, y)$ is a homogeneous
polynomial of degree $n$ in $x$ and $y$. It is frequently useful for $W_{\mathscr{C}}(x, y)$ to be a
homogeneous polynomial. But we can always get rid of $x$ by setting $x = 1$, and
still have a perfectly good weight enumerator

$$W_{\mathscr{C}}(1, y) = W_{\mathscr{C}}(y) = \sum_{i=0}^{n} A_i y^i. \tag{2}$$


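Likewise $A_i'$ will denote the number of codewords of weight $i$ in $\mathscr{C}^\perp$. The
weight enumerator of $\mathscr{C}^\perp$ is

$$W_{\mathscr{C}^\perp}(x, y) = \sum_{i=0}^{n} A_i' x^{n-i} y^i = \sum_{u \in \mathscr{C}^\perp} x^{n - \mathrm{wt}(u)} y^{\mathrm{wt}(u)}. \tag{3}$$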


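Examples. (i) Consider the even weight code $\{000, 011, 101, 110\}$, denoted by
$\mathscr{C}_1$. The dual $\mathscr{C}_1^\perp$ is $\{000, 111\}$ (Problem 34 of Ch. 1), and the weight
enumerators are:

$$W_{\mathscr{C}_1}(x, y) = x^3 + 3xy^2,$$
$$W_{\mathscr{C}_1^\perp}(x, y) = x^3 + y^3.$$

(ii) The code $\{00, 11\}$, denoted by $\mathscr{C}_2$, is self-dual: $\mathscr{C}_2^\perp = \mathscr{C}_2$, and

$$W_{\mathscr{C}_2}(x, y) = x^2 + y^2.$$

(iii) The $[7, 4, 3]$ Hamming code $\mathscr{H}_3$. From §9 of Ch. 1,

$$W_{\mathscr{H}_3}(x, y) = x^7 + 7x^4y^3 + 7x^3y^4 + y^7,$$
$$W_{\mathscr{H}_3^\perp}(x, y) = x^7 + 7x^3y^4.$$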


MacWilliams theorem for binary linear codes. The main result of this chapter
is that $W_{\mathscr{C}^\perp}(x, y)$ is given by a linear transformation of $W_{\mathscr{C}}(x, y)$. We give first
the version for binary codes (Theorem 1).

Let $F^n$ be the set of all binary vectors of length $n$. This is a vector space of
dimension $n$ over the binary field $F = \{0, 1\}$.
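Theorem 1. (MacWilliams theorem for binary linear codes.) If $\mathscr{C}$ is an $[n, k]$
binary linear code with dual code $\mathscr{C}^\perp$ then

$$W_{\mathscr{C}^\perp}(x, y) = \frac{1}{|\mathscr{C}|} W_{\mathscr{C}}(x + y, x - y), \tag{4}$$

where $|\mathscr{C}| = 2^k$ is the number of codewords in $\mathscr{C}$. Equivalently,

$$\sum_{k=0}^{n} A_k' x^{n-k} y^k = \frac{1}{|\mathscr{C}|} \sum_{i=0}^{n} A_i (x + y)^{n-i} (x - y)^i, \tag{5}$$

or

$$\sum_{u \in \mathscr{C}^\perp} x^{n - \mathrm{wt}(u)} y^{\mathrm{wt}(u)} = \frac{1}{|\mathscr{C}|} \sum_{u \in \mathscr{C}} (x + y)^{n - \mathrm{wt}(u)} (x - y)^{\mathrm{wt}(u)}. \tag{6}$$

Equations (4), (5), and (6) are sometimes called the MacWilliams identities.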


The proof depends on an important lemma. Let $f$ be any mapping defined
on $F^n$. We must be able to add and subtract the values $f(u)$, but otherwise $f$
can be arbitrary. The Hadamard transform $\hat{f}$ of $f$ (see §3 of Ch. 2) is defined
by

$$\hat{f}(u) = \sum_{v \in F^n} (-1)^{u \cdot v} f(v), \qquad u \in F^n. \tag{7}$$


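Lemma 2. If $\mathscr{C}$ is an $[n, k]$ binary linear code (i.e., a $k$-dimensional subspace
of $F^n$),

$$\sum_{u \in \mathscr{C}^\perp} f(u) = \frac{1}{|\mathscr{C}|} \sum_{u \in \mathscr{C}} \hat{f}(u). \tag{8}$$

Proof.

$$\sum_{u \in \mathscr{C}} \hat{f}(u) = \sum_{u \in \mathscr{C}} \sum_{v \in F^n} (-1)^{u \cdot v} f(v) = \sum_{v \in F^n} f(v) \sum_{u \in \mathscr{C}} (-1)^{u \cdot v}.$$

Now if $v \in \mathscr{C}^\perp$, $u \cdot v$ is always zero, and the inner sum is $|\mathscr{C}|$. But if $v \notin \mathscr{C}^\perp$
then by Problem 12 of Ch. 1, $u \cdot v$ takes the values 0 and 1 equally often, and
the inner sum is 0. Therefore

$$\sum_{u \in \mathscr{C}} \hat{f}(u) = |\mathscr{C}| \sum_{v \in \mathscr{C}^\perp} f(v). \qquad \text{Q.E.D.}$$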
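Proof of Theorem 1. We apply the lemma with

$$f(v) = x^{n - \mathrm{wt}(v)} y^{\mathrm{wt}(v)}.$$

Then we have

$$\hat{f}(u) = \sum_{v \in F^n} (-1)^{u \cdot v} x^{n - \mathrm{wt}(v)} y^{\mathrm{wt}(v)}. \tag{9}$$

Let $u = (u_1 \cdots u_n)$, $v = (v_1 \cdots v_n)$. Then

$$\hat{f}(u) = \sum_{v \in F^n} (-1)^{u_1v_1 + \cdots + u_nv_n} \prod_{i=1}^{n} x^{1 - v_i} y^{v_i} = \sum_{v_1=0}^{1} \sum_{v_2=0}^{1} \cdots \sum_{v_n=0}^{1} \prod_{i=1}^{n} (-1)^{u_iv_i} x^{1 - v_i} y^{v_i}. \tag{10}$$

Just as

$$a_0b_0c_0 + a_0b_0c_1 + a_0b_1c_0 + a_0b_1c_1 + a_1b_0c_0 + a_1b_0c_1 + a_1b_1c_0 + a_1b_1c_1 = (a_0 + a_1)(b_0 + b_1)(c_0 + c_1),$$

so (10) is equal to

$$\prod_{i=1}^{n} \sum_{w=0}^{1} (-1)^{u_iw} x^{1 - w} y^{w}.$$

If $u_i = 0$, the inner sum is $x + y$. If $u_i = 1$, it is $x - y$. Thus

$$\hat{f}(u) = (x + y)^{n - \mathrm{wt}(u)} (x - y)^{\mathrm{wt}(u)}. \tag{11}$$

Then Equation (8) reads

$$\sum_{u \in \mathscr{C}^\perp} x^{n - \mathrm{wt}(u)} y^{\mathrm{wt}(u)} = \frac{1}{|\mathscr{C}|} \sum_{u \in \mathscr{C}} (x + y)^{n - \mathrm{wt}(u)} (x - y)^{\mathrm{wt}(u)}. \qquad \text{Q.E.D.}$$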

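Examples. We apply Theorem 1 to the examples at the beginning of this
section.

(i) $W_{\mathscr{C}_1}(x, y) = x^3 + 3xy^2$,

$$\frac{1}{4} W_{\mathscr{C}_1}(x + y, x - y) = \frac{1}{4}\left[(x + y)^3 + 3(x + y)(x - y)^2\right] = x^3 + y^3,$$

which is indeed $W_{\mathscr{C}_1^\perp}(x, y)$. Again,

$$\frac{1}{2} W_{\mathscr{C}_1^\perp}(x + y, x - y) = \frac{1}{2}\left[(x + y)^3 + (x - y)^3\right] = x^3 + 3xy^2 = W_{\mathscr{C}_1}(x, y),$$

illustrating that the theorem is symmetric with respect to the rôles of $\mathscr{C}$ and
$\mathscr{C}^\perp$.

(ii) $W_{\mathscr{C}_2}(x, y) = x^2 + y^2$. So

$$\frac{1}{2} W_{\mathscr{C}_2}(x + y, x - y) = \frac{1}{2}\left[(x + y)^2 + (x - y)^2\right] = x^2 + y^2 = W_{\mathscr{C}_2}(x, y),$$

which is correct since $\mathscr{C}_2$ is self-dual.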
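The same check can be run on the $[7, 4, 3]$ Hamming code by listing the dual code explicitly and comparing with the right-hand side of (5) at $x = 1$. This is a brute-force sketch, not an efficient method; the generator matrix `G` below is one assumed systematic form of the Hamming code, and all function names are invented for the example.

```python
from itertools import product


def span(generators):
    """All GF(2) linear combinations of the generator rows (tuples of 0/1)."""
    n = len(generators[0])
    words = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        words.add(tuple(
            sum(c * g[j] for c, g in zip(coeffs, generators)) % 2 for j in range(n)
        ))
    return words


def dual(code, n):
    """All vectors of F^n orthogonal to every codeword."""
    return {
        v for v in product((0, 1), repeat=n)
        if all(sum(a * b for a, b in zip(u, v)) % 2 == 0 for u in code)
    }


def weight_distribution(code, n):
    dist = [0] * (n + 1)
    for word in code:
        dist[sum(word)] += 1
    return dist


def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def macwilliams(dist):
    """Coefficients in y of (1/|C|) * sum_i A_i (1+y)^(n-i) (1-y)^i."""
    n, size = len(dist) - 1, sum(dist)
    total = [0] * (n + 1)
    for i, a_i in enumerate(dist):
        term = [1]
        for _ in range(n - i):
            term = poly_mul(term, [1, 1])       # factor (1 + y)
        for _ in range(i):
            term = poly_mul(term, [1, -1])      # factor (1 - y)
        total = [t + a_i * c for t, c in zip(total, term)]
    return [t // size for t in total]           # exact for a linear code


G = [(1, 0, 0, 0, 0, 1, 1), (0, 1, 0, 0, 1, 0, 1),
     (0, 0, 1, 0, 1, 1, 0), (0, 0, 0, 1, 1, 1, 1)]
hamming = span(G)
A = weight_distribution(hamming, 7)
print(A, macwilliams(A), weight_distribution(dual(hamming, 7), 7))
```

The transformed distribution and the distribution obtained by enumerating the dual agree, as Theorem 1 predicts.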
Problems. (1) Verify that

$$\frac{1}{16} W_{\mathscr{H}_3}(x + y, x - y) = x^7 + 7x^3y^4 = W_{\mathscr{H}_3^\perp}(x, y).$$

(2) Show that Theorem 1 is symmetric with respect to the rôles of $\mathscr{C}$ and $\mathscr{C}^\perp$,
i.e.,

$$W_{\mathscr{C}}(x, y) = \frac{1}{|\mathscr{C}^\perp|} W_{\mathscr{C}^\perp}(x + y, x - y).$$

[Hint: Put $u = x + y$, $v = x - y$ in Equation (5) and remember that $W$ is
homogeneous.]

(3) Show that the weight enumerator of the $[n = 2^m - 1, n - m, 3]$ Hamming
code $\mathscr{H}_m$ is

$$\frac{1}{n + 1}\left[(x + y)^n + n(x + y)^{(n-1)/2}(x - y)^{(n+1)/2}\right].$$

(4) (a) If $\mathscr{C}$ is a code with minimum distance $\ge 3$, show that there are $iA_i$
vectors of weight $i - 1$ at distance 1 from $\mathscr{C}$, and $(n - i)A_i$ vectors of weight
$i + 1$ at distance 1 from $\mathscr{C}$.

(b) Hence show that for a Hamming code,

$$\sum_{i=1}^{n} iA_i y^{i-1} + \sum_{i=0}^{n} A_i y^i + \sum_{i=0}^{n-1} (n - i)A_i y^{i+1} = (1 + y)^n,$$

i.e. the weight distribution satisfies the recurrence $A_0 = 1$, $A_1 = 0$,

$$(i + 1)A_{i+1} + A_i + (n - i + 1)A_{i-1} = \binom{n}{i}.$$

(5) The extended Golay code (§6 of Ch. 2) is self-dual. Verify this by
working out the RHS of Equation (4) for this code.


Relationship between the $A_i'$'s and the $A_i$'s. What is the relationship between
the weight distributions $\{A_i'\}$ and $\{A_i\}$ obtained from (5)? If we write

$$(x + y)^{n-i}(x - y)^i = \sum_{k=0}^{n} P_k(i)\, x^{n-k} y^k \tag{12}$$

then from (5)

$$A_k' = \frac{1}{|\mathscr{C}|} \sum_{i=0}^{n} A_i P_k(i). \tag{13}$$

$P_k(i)$ is a Krawtchouk polynomial. The formal definition of these
polynomials follows.
Krawtchouk polynomials.


Definition. For any positive integer $n$, the Krawtchouk polynomial $P_k(x; n) = P_k(x)$
is defined by

$$P_k(x; n) = \sum_{j=0}^{k} (-1)^j \binom{x}{j} \binom{n - x}{k - j}, \qquad k = 0, 1, 2, \ldots \tag{14}$$

where $x$ is an indeterminate, and the binomial coefficients are defined in
Problem 18 of Ch. 1. Thus $P_k(x; n)$ is a polynomial in $x$ of degree $k$. If there is
no danger of confusion we omit the $n$.

From the binomial series (Problem 18 again) $P_k(x; n)$ has the generating
function

$$(1 + z)^{n-x}(1 - z)^x = \sum_{k=0}^{\infty} P_k(x) z^k. \tag{15}$$

If $i$, $n$ are integers with $0 \le i \le n$ this becomes

$$(1 + z)^{n-i}(1 - z)^i = \sum_{k=0}^{n} P_k(i) z^k. \tag{16}$$

The first few values are

$$P_0(x) = 1,$$
$$P_1(x) = n - 2x,$$
$$P_2(x) = \binom{n}{2} - 2nx + 2x^2,$$
$$P_3(x) = \binom{n}{3} - \left(n^2 - n + \frac{2}{3}\right)x + 2nx^2 - \frac{4x^3}{3}.$$

For further properties see §7.

The $A_k'$'s given by (13) are interesting even if $\mathscr{C}$ is a nonlinear code (cf.
Theorems 6 and 8). But to deal with nonlinear codes we need some new
machinery.
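Definition (14) and relation (13) translate directly into a few lines of code for integer arguments. The sketch below uses the names `krawtchouk` and `dual_distribution`, which are chosen here for illustration; it is restricted to integers $0 \le x \le n$, where the binomial coefficients are the ordinary ones.

```python
from fractions import Fraction
from math import comb


def krawtchouk(k, x, n):
    """P_k(x; n) from definition (14), for integers 0 <= x <= n."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))


def dual_distribution(dist):
    """A'_k = (1/|C|) * sum_i A_i P_k(i), equation (13)."""
    n, size = len(dist) - 1, sum(dist)
    return [
        Fraction(sum(a * krawtchouk(k, i, n) for i, a in enumerate(dist)), size)
        for k in range(n + 1)
    ]


hamming = [1, 0, 0, 7, 7, 0, 0, 1]      # weight distribution of the [7,4,3] Hamming code
print([int(a) for a in dual_distribution(hamming)])
```

Applied to the Hamming weight distribution this returns the distribution $1, 0, 0, 0, 7, 0, 0, 0$ of its dual. Exact rational arithmetic is used because for a nonlinear code the numbers in (13) need not be integers.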


*Moments of the weight distribution. Before proceeding with nonlinear codes
we digress to give some other identities relating the weight distribution $\{A_i\}$ of
an $[n, k]$ code, and the weight distribution $\{A_i'\}$ of the dual code.

Our starting point is equation (5) with $x = 1$, rewritten as

$$\sum_{i=0}^{n} A_i y^i = \frac{1}{2^{n-k}} \sum_{i=0}^{n} A_i' (1 + y)^{n-i}(1 - y)^i. \tag{17}$$

Setting $y = 1$ gives (since $A_0' = 1$)

$$\sum_{i=0}^{n} A_i = 2^k,$$

as it should.

Differentiating (17) with respect to $y$ and setting $y = 1$ gives the first
moment:

$$\sum_{i=0}^{n} \frac{iA_i}{2^k} = \frac{1}{2}(n - A_1') = \frac{n}{2} \quad \text{if } A_1' = 0.$$

So if $A_1' = 0$, the mean weight is $\frac{1}{2}n$. Continuing in this way we obtain:

$$\sum_{i=\nu}^{n} \binom{i}{\nu} A_i = 2^{k-\nu} \sum_{i=0}^{\nu} (-1)^i \binom{n - i}{\nu - i} A_i' \tag{18}$$

for $\nu = 0, 1, \ldots, n$. The LHS of (18) is called a binomial moment of the $A_i$'s.
Now suppose the dual code has minimum distance $d'$, so that

$$A_1' = \cdots = A_{d'-1}' = 0.$$

If $\nu < d'$, the RHS of (18) no longer depends on the code: thus

$$\sum_{i=\nu}^{n} \binom{i}{\nu} \frac{A_i}{2^k} = \frac{1}{2^{\nu}} \binom{n}{\nu}, \qquad \nu = 0, 1, \ldots, d' - 1. \tag{19}$$

This will be used in the next chapter. Equation (19) shows that the $\nu$-th
binomial moment, for $\nu = 0, 1, \ldots, d' - 1$, is independent of the code, and is
equal to that of the $[n, n, 1]$ code $F^n$.
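Equation (19) is easy to test on the $[7, 4, 3]$ Hamming code, whose dual has minimum distance $d' = 4$. The snippet is a small sketch; `binomial_moment` is a name introduced here, and the weight distribution is the one quoted in §2.

```python
from fractions import Fraction
from math import comb


def binomial_moment(dist, nu):
    """(1/|C|) * sum_i C(i, nu) * A_i, the LHS of (18) divided by 2^k."""
    return Fraction(sum(comb(i, nu) * a for i, a in enumerate(dist)), sum(dist))


hamming = [1, 0, 0, 7, 7, 0, 0, 1]      # [7,4,3] Hamming code
n, d_dual = 7, 4                        # the dual code has minimum distance 4

for nu in range(n + 1):
    code_free = Fraction(comb(n, nu), 2 ** nu)      # RHS of (19)
    print(nu, binomial_moment(hamming, nu), code_free)
```

The two columns agree for $\nu = 0, 1, 2, 3$ and first differ at $\nu = 4 = d'$, where the term in $A_4'$ on the right of (18) starts to contribute.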


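Problems. (6) Prove the following alternative version of (18):

$$\sum_{i=0}^{n-\nu} \binom{n - i}{\nu} A_i = 2^{k-\nu} \sum_{i=0}^{\nu} \binom{n - i}{\nu - i} A_i'$$

for $\nu = 0, 1, \ldots, n$.

(7) If instead of differentiating (17) with respect to $y$ we apply the operator
$y(d/dy)$ $\nu$ times and set $y = 1$, we obtain the power moments of the $A_i$'s. For
$\nu < d'$ these are particularly simple. Show that

$$\sum_{i=0}^{n} \frac{i^2 A_i}{2^k} = \frac{n(n + 1)}{4}, \quad \text{if } 2 < d', \tag{20}$$

and

$$\sum_{i=0}^{n} \frac{i^{\nu} A_i}{2^k} = \frac{1}{2^n} \sum_{i=0}^{n} i^{\nu} \binom{n}{i}, \quad \text{if } 0 \le \nu < d'. \tag{21}$$

In general, show that for $\nu = 0, 1, \ldots$

$$\sum_{i=0}^{n} i^{\nu} A_i = \sum_{i=0}^{\nu} (-1)^i A_i' \left\{ \sum_{r=i}^{\nu} r!\, S(\nu, r)\, 2^{k-r} \binom{n - i}{n - r} \right\}, \tag{22}$$

where

$$S(\nu, r) = \frac{1}{r!} \sum_{j=0}^{r} (-1)^{r-j} \binom{r}{j} j^{\nu}$$

is a Stirling number of the second kind (and the conventions for binomial
coefficients given in Ch. 1 apply). The equations (22) are frequently called the
Pless identities.

(8) Show that if there are only $r$ unknown $A_i$'s, and $A_1', \ldots, A_{r-1}'$ are
known, then all the $A_i$'s can be determined. [Hint: the coefficients of the LHS
of (22) form a Vandermonde matrix; see Lemma 17 of Ch. 4.]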

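(9) Moments about the mean. (a) Show that (17) implies

$$\sum_{j=0}^{n} A_j e^{(n - 2j)x} = 2^k \sum_{j=0}^{n} A_j' \cosh^{n-j} x \, \sinh^{j} x. \tag{23}$$

(b) Hence show that for $r = 0, 1, \ldots$

$$\sum_{i=0}^{n} \left(\frac{n}{2} - i\right)^r A_i = 2^{k-r} \sum_{i=0}^{n} F_r^{(i)}(n) A_i', \tag{24}$$

where

$$F_r^{(i)}(n) = \left[\frac{d^r}{dx^r} \cosh^{n-i} x \, \sinh^{i} x\right]_{x=0}.$$

(c) Show that $F_r^{(i)}(n) \ge 0$ and

$$F_r^{(0)}(n) = \frac{1}{2^n} \sum_{i=0}^{n} \binom{n}{i} (n - 2i)^r.$$

(d) Hence, for $r = 0, 1, \ldots$,

$$\frac{1}{2^k} \sum_{i=0}^{n} \left(\frac{n}{2} - i\right)^r A_i \ge \frac{1}{2^n} \sum_{i=0}^{n} \left(\frac{n}{2} - i\right)^r \binom{n}{i}. \tag{25}$$

Note that the RHS of (25) is the $r$-th moment about the mean of the weight
distribution of the $[n, n, 1]$ code $F^n$.

(e) Show that the equality holds in (25) if $r \le d' - 1$.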


Research Problem (5.1). Let $\mathscr{C}$ be a linear code. As in §5 of Ch. 1 let $\alpha_i$ be the
number of coset leaders of $\mathscr{C}$ of weight $i$, and $\alpha_i'$ the corresponding number
for $\mathscr{C}^\perp$. To what extent do the numbers $\{\alpha_i\}$ determine $\{\alpha_i'\}$?


§3. The group algebra


We are going to describe binary vectors of length $n$ by polynomials in
$z_1, \ldots, z_n$. For example, $100 \cdots 0$ will be represented by $z_1$, $1010 \cdots 0$ by $z_1z_3$,
and so on. In general $v = v_1v_2 \cdots v_n$ is represented by $z_1^{v_1}z_2^{v_2} \cdots z_n^{v_n}$, which we
abbreviate $z^v$. Clearly if we know $z^v$ we can recover $v$. Thus $\sum_{v \in \mathscr{C}} z^v$ is just a
very fancy weight enumerator for $\mathscr{C}$. (We shall use this type of weight
enumerator again in §6 for codes over GF($q$).) We make the convention that
$z_i^2 = 1$ for all $i$. This makes the set of all $z^v$ into a multiplicative group denoted
by $G$. Thus $F^n$ and $G$ are isomorphic groups, with addition in $F^n$

$$v + w = (v_1, \ldots, v_n) + (w_1, \ldots, w_n) = (v_1 + w_1, \ldots, v_n + w_n)$$

corresponding to multiplication in $G$:

$$z^v z^w = z_1^{v_1} \cdots z_n^{v_n} \cdot z_1^{w_1} \cdots z_n^{w_n} = z_1^{v_1 + w_1} \cdots z_n^{v_n + w_n} = z^{v + w}.$$

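Definition. The group algebra $QG$ of $G$ over the rational numbers $Q$ consists
of all formal sums

$$\sum_{v \in F^n} a_v z^v, \qquad a_v \in Q, \ z^v \in G.$$

Addition and multiplication of elements of $QG$ are defined in the natural way
by

$$\sum_{v \in F^n} a_v z^v + \sum_{v \in F^n} b_v z^v = \sum_{v \in F^n} (a_v + b_v) z^v,$$

$$r \sum_{v \in F^n} a_v z^v = \sum_{v \in F^n} r a_v z^v, \qquad r \in Q,$$

and

$$\sum_{v \in F^n} a_v z^v \cdot \sum_{w \in F^n} b_w z^w = \sum_{v, w \in F^n} a_v b_w z^{v + w}.$$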

$QG$ gives us an algebraic notation for subsets of $F^n$, or codes:
corresponding to the code $\mathscr{C} \subseteq F^n$ we have the element

$$C = \sum_{u \in \mathscr{C}} z^u$$

of $QG$. For example, corresponding to the code $\{000, 011, 101, 110\}$ we have

$$C = 1 + z_2z_3 + z_1z_3 + z_1z_2.$$

In general it may be helpful to think of the elements of $QG$ as "generalized
codes"; the coefficient $a_v$ being the number of times $v$ occurs in the "code."


Problem (10). For $n = 3$ show that

$$(z_1 + z_2 + z_3)^2 = 3 + 2(z_1z_2 + z_2z_3 + z_3z_1).$$

One of the nice things about the group algebra notation is that it provides a
concise way of stating certain properties of codes. Let

$$Y_i = \sum_{\mathrm{wt}(u) = i} z^u$$

represent the vectors of weight $i$. For example,

$$Y_0 = 1,$$
$$Y_1 = z_1 + z_2 + \cdots + z_n,$$
$$Y_2 = z_1z_2 + z_1z_3 + \cdots + z_{n-1}z_n.$$

The sphere of radius $e$ around $v$ is then described by

$$z^v(Y_0 + Y_1 + \cdots + Y_e).$$

If $\mathscr{C}$ is a perfect $(n, M, 2e + 1)$ code, this fact is expressed by the identity

$$C \cdot (Y_0 + Y_1 + \cdots + Y_e) = \sum_{u \in F^n} z^u, \tag{26}$$

where

$$C = \sum_{u \in \mathscr{C}} z^u.$$


Example. For the perfect single-error-correcting code $\{000, 111\}$,
$C = 1 + z_1z_2z_3$, $Y_0 + Y_1 = 1 + z_1 + z_2 + z_3$, and indeed

$$C \cdot (Y_0 + Y_1) = 1 + z_1 + z_2 + z_3 + z_2z_3 + z_1z_3 + z_1z_2 + z_1z_2z_3.$$
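Multiplication in $QG$ is straightforward to model: since $z_i^2 = 1$, multiplying $z^v$ by $z^w$ adds the exponent vectors modulo 2. The sketch below represents an element of $QG$ as a dictionary from binary tuples to coefficients; this representation and the names `ga_mul` and `shell` are assumptions made for the illustration.

```python
from collections import Counter
from itertools import product


def ga_mul(a, b):
    """Product in QG of two elements given as {vector tuple: coefficient}."""
    out = Counter()
    for v, av in a.items():
        for w, bw in b.items():
            # z^v * z^w = z^(v + w), with addition modulo 2 (XOR)
            out[tuple(x ^ y for x, y in zip(v, w))] += av * bw
    return {v: c for v, c in out.items() if c != 0}


def shell(n, i):
    """Y_i: the sum of z^u over all u of weight i."""
    return {u: 1 for u in product((0, 1), repeat=n) if sum(u) == i}


C = {(0, 0, 0): 1, (1, 1, 1): 1}            # the code {000, 111}
ball = {**shell(3, 0), **shell(3, 1)}       # Y_0 + Y_1
print(ga_mul(C, ball))
```

Every vector of $F^3$ appears in the product with coefficient 1, which is identity (26) for this code. Squaring `shell(3, 1)` in the same way gives the expression of Problem 10.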


Problem. (11) Describe the direct sum construction and the $|u|u + v|$
construction (§9 of Ch. 2) using the group algebra.


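§4. Characters


To each $u \in F^n$ we associate the mapping $\chi_u$ from $G$ to the rational
numbers given by

$$\chi_u(z^v) = (-1)^{u \cdot v}, \tag{27}$$

where $u \cdot v$ is the scalar product of the vectors $u$, $v$ over $Q$. $\chi_u$ is called a
character of $G$. $\chi_u$ is extended to act on $QG$ by linearity:

$$\chi_u\left(\sum_{v \in F^n} a_v z^v\right) = \sum_{v \in F^n} a_v \chi_u(z^v) = \sum_{v \in F^n} (-1)^{u \cdot v} a_v. \tag{28}$$

Note that

$$\chi_u(z^v) = \begin{cases} 1 & \text{if } u, v \text{ are orthogonal,} \\ -1 & \text{if not.} \end{cases}$$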

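Problems. (12) (i) $\chi_u(z^v) = \chi_v(z^u)$.
(ii) $\chi_u(z^v)\chi_u(z^w) = \chi_u(z^{v + w})$.
(iii) If $C_1$, $C_2$ are arbitrary elements of $QG$,
$\chi_u(C_1)\chi_u(C_2) = \chi_u(C_1 C_2)$.
(iv) $\chi_u(z^w)\chi_v(z^w) = \chi_{u + v}(z^w)$.
(13) If $\mathscr{C}$ is a linear code and $C = \sum_{u \in \mathscr{C}} z^u$, then

$$\chi_v(C) = \begin{cases} |\mathscr{C}| & \text{if } v \in \mathscr{C}^\perp, \\ 0 & \text{if } v \notin \mathscr{C}^\perp. \end{cases}$$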
(14) Show that if $u \in F^n$ has weight $i$, then

$$\sum_{\mathrm{wt}(v) = k} (-1)^{u \cdot v} = \chi_u(Y_k) = P_k(i).$$

(15) More about characters. More precisely, a character $\chi$ of an abelian
group $G$ is any homomorphism from $G$ to the multiplicative group of the
complex numbers of absolute value 1. For $G$ as defined at the beginning of
Section 3, the characters take the values $\pm 1$. Show that the characters of $G$
form a group $X$ which is isomorphic to $G$ (and to $F^n$). Then (27) is just one
example of an isomorphism between $F^n$ and $X$.

(16) (An inversion formula) Let

$$C = \sum_{v \in F^n} c_v z^v$$

and suppose the numbers

$$\chi_u(C) = \sum_{v \in F^n} c_v (-1)^{u \cdot v}$$

are known. Show that

$$c_v = \frac{1}{2^n} \sum_{u \in F^n} (-1)^{u \cdot v} \chi_u(C),$$

and so $C$ is determined by the $\chi_u(C)$.


§5. MacWilliams theorem for nonlinear codes


Weight enumerator of an element of the group algebra. Let

\[ C = \sum_{v \in F^n} c_v z^v \]

be an arbitrary element of the group algebra $QG$, with the property that

\[ M = \sum_{v \in F^n} c_v \neq 0. \]

We call the $(n+1)$-tuple $\{A_0, \ldots, A_n\}$, where

\[ A_i = \sum_{\mathrm{wt}(v) = i} c_v, \]

the weight distribution of $C$. This is the natural generalization of the weight distribution of a code. Of course $\sum A_i = M$. As in §2 we also define the weight enumerator of $C$ to be

\[ W_C(x, y) = \sum_{v \in F^n} c_v x^{n - \mathrm{wt}(v)} y^{\mathrm{wt}(v)} = \sum_{i=0}^{n} A_i x^{n-i} y^i. \]

Definition. The transform of $C$ is the element $C'$ of $QG$ given by

\[ C' = \frac{1}{M} \sum_{u \in F^n} \chi_u(C) z^u, \tag{29} \]

where $\chi_u$ was defined in §4.
Suppose

\[ C' = \sum_{u \in F^n} c'_u z^u, \]

so that

\[ c'_u = \frac{1}{M} \chi_u(C) = \frac{1}{M} \sum_{v \in F^n} (-1)^{u \cdot v} c_v, \qquad u \in F^n. \tag{30} \]

(The $c'_u$'s are proportional to the Hadamard transform of the $c_v$'s; see §3 of Ch. 2.) Then the weight distribution of $C'$ is $\{A'_0, \ldots, A'_n\}$, where

\[ A'_i = \sum_{\mathrm{wt}(u) = i} c'_u = \frac{1}{M} \sum_{\mathrm{wt}(u) = i} \chi_u(C), \tag{31} \]

and the weight enumerator of $C'$ is

\[ W_{C'}(x, y) = \sum_{i=0}^{n} A'_i x^{n-i} y^i. \]

As in Theorem 1, $W_{C'}$ is given by a linear transformation of $W_C$.


Theorem 3.
\[ W_{C'}(x, y) = \frac{1}{M} W_C(x + y, x - y), \tag{32} \]
or equivalently, using (15),
\[ A'_k = \frac{1}{M} \sum_{i=0}^{n} A_i P_k(i), \qquad k = 0, \ldots, n. \tag{33} \]


This theorem and Theorem 5 can be thought of as MacWilliams theorems for 
nonlinear codes. 
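The agreement between (31) and (33) can be tested numerically on a small nonlinear code. The sketch below is illustrative only: the three-word code, the function names, and the use of exact fractions are assumptions made for this example, and all coefficients $c_v$ are taken to be 1.

```python
from fractions import Fraction
from itertools import product
from math import comb

def krawtchouk(k, x, n):
    """Binary Krawtchouk polynomial P_k(x; n) at an integer 0 <= x <= n."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))

def transform_by_characters(code, n):
    """A'_i from (31): (1/M) * sum over wt(u) = i of chi_u(C)."""
    M = len(code)
    out = [Fraction(0)] * (n + 1)
    for u in product((0, 1), repeat=n):
        chi = sum((-1) ** sum(a * b for a, b in zip(u, v)) for v in code)
        out[sum(u)] += Fraction(chi, M)
    return out

def transform_by_krawtchouk(A, n):
    """A'_k from (33): (1/M) * sum_i A_i P_k(i)."""
    M = sum(A)
    return [Fraction(sum(A[i] * krawtchouk(k, i, n) for i in range(n + 1)), M)
            for k in range(n + 1)]

n = 4
code = [(0, 0, 0, 0), (1, 1, 0, 0), (1, 0, 1, 1)]    # an arbitrary nonlinear code
A = [sum(1 for v in code if sum(v) == i) for i in range(n + 1)]

by_characters = transform_by_characters(code, n)
by_krawtchouk = transform_by_krawtchouk(A, n)
print(by_characters == by_krawtchouk)
```

Working in `Fraction` rather than floating point matters here because the transformed distribution of a nonlinear code need not be integral.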


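Proof. (32) is also equivalent to

\[ \sum_{u \in F^n} c'_u x^{n - \mathrm{wt}(u)} y^{\mathrm{wt}(u)} = \frac{1}{M} \sum_{v \in F^n} c_v (x + y)^{n - \mathrm{wt}(v)} (x - y)^{\mathrm{wt}(v)}. \]

From (30), the LHS is equal to

\[ \frac{1}{M} \sum_{v \in F^n} c_v \sum_{u \in F^n} (-1)^{u \cdot v} x^{n - \mathrm{wt}(u)} y^{\mathrm{wt}(u)} = \frac{1}{M} \sum_{v \in F^n} c_v (x + y)^{n - \mathrm{wt}(v)} (x - y)^{\mathrm{wt}(v)}, \]

by (9) and (11) of §2. Q.E.D.

For any $(n+1)$-tuple $\{A_0, \ldots, A_n\}$ with

\[ M = \sum_{i=0}^{n} A_i \neq 0 \]

we call $\{A'_0, \ldots, A'_n\}$ the MacWilliams transform of the $A_i$'s. By Theorem 3 this can be obtained either from (31) or from (33).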


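Problems. (17) Show that $(C')' = C/c_0$, provided $c_0 \neq 0$.
(18) Show that $A'_0 = 1$.
(19) Show that

\[ \sum_{i=0}^{n} A'_i = \frac{2^n}{M} A_0. \]

(20) Let $\mathscr{C}$ be an $[n, k]$ linear code and suppose the coset $v + \mathscr{C}$ contains $E_i$ vectors of weight $i$. Use (31) to show that the transform $E'_i$ is $\alpha_i - \beta_i$, where $\alpha_i$ is the number of codewords of weight $i$ in $\mathscr{C}^\perp$ which are orthogonal to $v$, and $\beta_i$ is the number which are not. Hence show

\[ \sum_{i=0}^{n} E_i x^{n-i} y^i = \frac{1}{|\mathscr{C}^\perp|} \sum_{i=0}^{n} (\alpha_i - \beta_i)(x + y)^{n-i} (x - y)^i. \]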


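Distance distribution of a nonlinear code. Now let $\mathscr{C}$ be a linear or nonlinear $(n, M, d)$ code. $\mathscr{C}$ is described by the element

\[ C = \sum_{v \in \mathscr{C}} z^v \tag{34} \]

of $QG$.


Problem. (21) Show that if $\mathscr{C}$ is linear then

\[ C' = \sum_{v \in \mathscr{C}^\perp} z^v. \]

Thus Theorem 1 is a special case of Theorem 3.

Now let $\mathscr{C}$ be linear or nonlinear. Let $D = (1/M)C^2$. If we expand

\[ D = \sum_{w \in F^n} d_w z^w, \]

the weight distribution of $D$ is $\{B_0, \ldots, B_n\}$, where

\[ B_i = \sum_{\mathrm{wt}(w) = i} d_w. \]


Lemma 4. $(B_0, \ldots, B_n)$ is the distance distribution of $\mathscr{C}$.


Proof.

\[ D = \frac{1}{M} \sum_{u, v \in \mathscr{C}} z^{u+v}. \]

Therefore

\[ B_i = \frac{1}{M} \sum_{\substack{u, v \in \mathscr{C} \\ \mathrm{dist}(u, v) = i}} 1. \]

Q.E.D.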


By applying Theorem 3 to the element $D = (1/M)C^2$ of $QG$ we obtain:


Theorem 5. The transform of the distance distribution is

\[ B'_k = \frac{1}{M} \sum_{i=0}^{n} B_i P_k(i), \qquad k = 0, \ldots, n. \tag{35} \]

Q.E.D.

Properties of the $B'_k$'s


Theorem 6. $B'_i \geq 0$, $i = 0, \ldots, n$.


Proof.

\[ B'_i = \frac{1}{M} \sum_{\mathrm{wt}(u) = i} \chi_u(D) = \frac{1}{M^2} \sum_{\mathrm{wt}(u) = i} \chi_u(C^2) = \frac{1}{M^2} \sum_{\mathrm{wt}(u) = i} \chi_u(C)^2 \quad \text{(by Problem 12)} \tag{36} \]

which is $\geq 0$. Q.E.D.


This innocent-looking result turns out to be quite useful; see the linear programming bound (§4 of Ch. 17). Another proof is given in Theorem 12 of Ch. 21.


Theorem 7. (a) If $0 \in \mathscr{C}$, $B_i = 0 \Rightarrow A_i = 0$.
(b) $B'_i = 0 \Rightarrow \chi_u(C) = 0$ for all $u$ of weight $i$ $\Rightarrow A'_i = 0$.
(c) $B'_i \neq 0 \Rightarrow \chi_u(C) \neq 0$ for some $u$ of weight $i$.


Proof. (a) is obvious. (b) and (c) follow from (36) and (31). Q.E.D.
Dual distance and orthogonal arrays.


Definition. The dual distance $d'$ of a code $\mathscr{C}$ is defined by $B'_i = 0$ for $1 \leq i \leq d' - 1$, $B'_{d'} \neq 0$. If $\mathscr{C}$ is linear, $d'$ is the minimum distance of $\mathscr{C}^\perp$.


Note that with this definition of $d'$, Equations (18) to (25) and Problems (6) to (9) hold for the distance distributions of nonlinear codes.

Let $[\mathscr{C}]$ be the $M \times n$ array of all codewords of $\mathscr{C}$. If $\mathscr{C}$ is linear, any set of $r \leq d' - 1$ columns of $[\mathscr{C}]$ must be linearly independent, otherwise $\mathscr{C}^\perp$ would contain a vector of weight $r < d'$. (This is the dual statement to Theorem 10 of Ch. 1.) This statement is also true for nonlinear codes. We rephrase it slightly.


Theorem 8. Any set of $r \leq d' - 1$ columns of $[\mathscr{C}]$ contains each $r$-tuple exactly $M/2^r$ times, and $d'$ is the largest number with this property.


Remark. An array with this property is called an orthogonal array of $n$ constraints, 2 levels, strength $d' - 1$ and index $M/2^{d'-1}$ (see §8 of Ch. 11).

Proof. To prove the first half, from the proof of Theorem 7 we have $\chi_u(C) = 0$ for all $u$ of weight $1, 2, \ldots, d' - 1$. $\chi_u(C) = 0$ for all $u$ of weight 1 implies that each column of $[\mathscr{C}]$ has $M/2$ zeros and $M/2$ ones. Then $\chi_u(C) = 0$ for all $u$ of weight 2 implies that each pair of columns contains $M/4$ occurrences of each of 00, 10, 01, 11. Clearly one can go on like this up to $d' - 1$. Conversely, since $B'_{d'} \neq 0$, there is a $u$ of weight $d'$ such that $\chi_u(C) \neq 0$. Q.E.D.


Examples. (i) The $(11, 12, 6)$ Hadamard code $\mathscr{A}_{12}$ (§3 of Ch. 2). Since this is a simplex code and contains 0, the weight and distance distributions are equal:

    i:            0    6
    A_i = B_i:    1   11

with $W_{\mathscr{A}_{12}}(x, y) = x^{11} + 11 x^5 y^6$.

The transformed distribution is obtained from

\[ \frac{1}{12} W_{\mathscr{A}_{12}}(x + y, x - y) = \frac{1}{12}\bigl((x + y)^{11} + 11(x + y)^5 (x - y)^6\bigr) = \sum_{i=0}^{11} B'_i x^{11-i} y^i, \]

and is equal to

    i:       0    3        4        5        6        7        8        11
    B'_i:    1    18 1/3   36 2/3   29 1/3   29 1/3   36 2/3   18 1/3   1

Note that $\sum B'_i = 170\tfrac{2}{3} = 2^{11}/12$. Also $B'_i \geq 0$, as required by Theorem 6. This table shows that $d' = 3$.
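The transformed distribution in this example follows directly from (33). The following sketch recomputes it with exact rational arithmetic; the function names are illustrative and not taken from the text.

```python
from fractions import Fraction
from math import comb

def krawtchouk(k, x, n):
    """Binary Krawtchouk polynomial P_k(x; n) at an integer 0 <= x <= n."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))

def macwilliams_transform(B, n):
    """B'_k = (1/M) * sum_i B_i P_k(i), where M = sum of the B_i."""
    M = sum(B)
    return [Fraction(sum(B[i] * krawtchouk(k, i, n) for i in range(n + 1)), M)
            for k in range(n + 1)]

n = 11
B = [0] * (n + 1)
B[0], B[6] = 1, 11                        # distance distribution of the (11, 12, 6) code

B_prime = macwilliams_transform(B, n)
dual_distance = min(i for i in range(1, n + 1) if B_prime[i] != 0)

for i, value in enumerate(B_prime):
    if value != 0:
        print(i, value)                   # nonzero B'_i as exact fractions
print("d' =", dual_distance)
```

The fractions 55/3, 110/3 and 88/3 printed by this sketch are the mixed numbers 18 1/3, 36 2/3 and 29 1/3 of the table.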

The list of codewords $[\mathscr{A}_{12}]$ is found in the top half of Fig. 2.1. To illustrate Theorem 8 we observe that in any two columns of $[\mathscr{A}_{12}]$ the vectors 00, 01, 10, 11 each occur 3 times.

(ii) The (8, 16,2) code shown in Fig. 5.1. 


Fig. 5.1. An (8,16,2) nonlinear code. [Array of the 16 codewords not reproduced.]


See Problem 22 of Ch. 15. Again the weight and distance distributions coincide:

    i:            0   2   6   8
    A_i = B_i:    1   7   7   1

with $W(x, y) = x^8 + 7x^6 y^2 + 7x^2 y^6 + y^8$.

The transformed distribution is obtained from $\frac{1}{16} W(x + y, x - y)$. But, after simplification, this turns out to be equal to $W(x, y)$! So for this code $B'_i = B_i$. Also $d' = 2$.

This code has another unusual property, namely $D' = D$. The proof of this is a good exercise in the use of the group algebra:


Problem. (22) Show that 


1 1 
CH=1t2z22+++++2,2%et22°°° Zg pitty) + ZZ Ze, 
2 8 


and that 
-Le 
D= is € 
-dEi z: ad E 2s 
D' = D. 


Note that if we only know D, in general C cannot be recovered. 

(iii) The $(16, 256, 6)$ Nordstrom-Robinson code $\mathscr{N}_{16}$ (§8 of Ch. 2). Once again $A_i = B_i$; see Fig. 2.19. This code also has the property that $B'_i = B_i$, although it is laborious to show this directly. It will follow from the results of Ch. 15.


Problem. (23) Let $\mathscr{C}$ be a code and $C$ be given by (34). (i) Show $D = C$ iff $\mathscr{C}$ is linear.
(ii) Show $C' = C$ iff $\mathscr{C}$ is linear and self-dual.


*§6. Generalised MacWilliams theorems for linear codes


In this section we describe several weight enumerators of linear codes over an arbitrary field $F = GF(q) = GF(p^m)$, where $p$ is a prime. Let the elements of $GF(q)$ be denoted by $\omega_0 = 0, \omega_1, \ldots, \omega_{q-1}$, in some fixed order.


Complete weight enumerator. The first weight enumerator to be considered classifies codewords $u$ in $F^n$ according to the number of times each field element $\omega_j$ appears in $u$.

Definition. The composition of $u = (u_1, \ldots, u_n)$, denoted by comp$(u)$, is $(s_0, s_1, \ldots, s_{q-1})$ where $s_i = s_i(u)$ is the number of components $u_j$ equal to $\omega_i$. Clearly

\[ \sum_{i=0}^{q-1} s_i = n. \]

Let $\mathscr{C}$ be a linear $[n, k]$ code over $GF(q)$ and let $A(t)$ be the number of codewords $u \in \mathscr{C}$ with comp$(u) = t = (t_0, \ldots, t_{q-1})$. Then the complete weight enumerator of $\mathscr{C}$ is

\[ W_{\mathscr{C}}(z_0, \ldots, z_{q-1}) = \sum_{t} A(t) z_0^{t_0} \cdots z_{q-1}^{t_{q-1}} = \sum_{u \in \mathscr{C}} z_0^{s_0(u)} \cdots z_{q-1}^{s_{q-1}(u)}. \tag{37} \]

For example, let $\mathscr{C}_4$ be the $[4, 2, 3]$ ternary code #6 of Ch. 1. The complete weight enumerator is

\[ W_{\mathscr{C}_4}(z_0, z_1, z_2) = z_0^4 + z_0 z_1^3 + 3 z_0 z_1^2 z_2 + 3 z_0 z_1 z_2^2 + z_0 z_2^3 = z_0\bigl(z_0^3 + (z_1 + z_2)^3\bigr). \tag{38} \]


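Characters of GF(q). In order to state the theorem we need to define the characters of $GF(q)$. Recall from Ch. 4 that any element $\beta$ of $GF(q)$ can be written in the form

\[ \beta = \beta_0 + \beta_1 \alpha + \cdots + \beta_{m-1} \alpha^{m-1}, \]

or equivalently as an $m$-tuple

\[ \beta = (\beta_0, \beta_1, \ldots, \beta_{m-1}), \]

where $\alpha$ is a primitive element of $GF(q)$, and $0 \leq \beta_i \leq p - 1$. Let $\xi$ be the complex number $e^{2\pi i/p}$. This is a primitive $p$-th root of unity, i.e., $\xi^p = e^{2\pi i} = 1$, while $\xi^l \neq 1$ for $0 < l < p$.


Definition. For each $\beta = (\beta_0, \ldots, \beta_{m-1}) \in GF(q)$ define $\chi_\beta$ to be the complex-valued mapping defined on $GF(q)$ by

\[ \chi_\beta(\gamma) = \xi^{\beta_0 \gamma_0 + \cdots + \beta_{m-1} \gamma_{m-1}} \tag{39} \]

for $\gamma = (\gamma_0, \ldots, \gamma_{m-1}) \in GF(q)$. $\chi_\beta$ is called a character of $GF(q)$.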


Problems. (24) Show $\chi_\beta(\gamma) = \chi_\gamma(\beta)$, for all $\beta, \gamma \in GF(q)$.
(25) Show that

\[ \chi_\beta(\gamma + \gamma') = \chi_\beta(\gamma)\chi_\beta(\gamma') \tag{40} \]

for all $\beta, \gamma, \gamma' \in GF(q)$. Thus $\chi_\beta$ is a homomorphism from the additive group of $GF(q)$ into the multiplicative group of complex numbers of magnitude 1.
(26) Show that

\[ \chi_{\beta + \beta'}(\gamma) = \chi_\beta(\gamma)\chi_{\beta'}(\gamma) \]

for all $\beta, \beta', \gamma \in GF(q)$. Thus the set of all $q$ characters $\chi_\beta$ forms a group which is isomorphic to the additive group of $GF(q)$.


Example. $q = p = 3$, with $GF(3) = \{0, 1, 2\}$ and $\xi = \omega = e^{2\pi i/3} = \cos 120^\circ + i \sin 120^\circ$. There are three characters:

    χ_0(0) = 1,  χ_0(1) = 1,    χ_0(2) = 1    (the trivial character);
    χ_1(0) = 1,  χ_1(1) = ω,    χ_1(2) = ω²;
    χ_2(0) = 1,  χ_2(1) = ω²,   χ_2(2) = ω.


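Lemma 9. For any nonzero $\beta \in GF(q)$,

\[ \sum_{\gamma \in GF(q)} \chi_\beta(\gamma) = 0. \]


Proof. The sum is equal to

\[ \sum_{\gamma \in GF(q)} \xi^{\beta_0 \gamma_0 + \cdots + \beta_{m-1} \gamma_{m-1}} = \prod_{j=0}^{m-1} \Bigl(\sum_{\gamma_j = 0}^{p-1} \xi^{\beta_j \gamma_j}\Bigr). \]

Since $\beta \neq 0$ there is a nonzero $\beta_j$, say $\beta_r \neq 0$. Then the $r$-th factor in the above product is

\[ \sum_{\gamma_r = 0}^{p-1} \xi^{\beta_r \gamma_r} = \sum_{j=0}^{p-1} \xi^j = \frac{1 - \xi^p}{1 - \xi} = 0. \quad \text{Q.E.D.} \]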


Example. In $GF(3)$ with $\beta = 1$ the lemma says

\[ \sum_{\gamma = 0}^{2} \chi_1(\gamma) = 1 + \omega + \omega^2 = 0. \]
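Lemma 9 is easy to confirm numerically in the prime-field case. The sketch below is restricted to $q = p$ (so $m = 1$ and field elements are the integers mod $p$); that restriction and the function names are assumptions of the example, not of the lemma.

```python
import cmath

def character(beta, gamma, p):
    """chi_beta(gamma) = xi^(beta * gamma) on GF(p), with xi = exp(2*pi*i/p)."""
    return cmath.exp(2j * cmath.pi * ((beta * gamma) % p) / p)

def character_sum(beta, p):
    """Sum of chi_beta(gamma) over all gamma in GF(p)."""
    return sum(character(beta, gamma, p) for gamma in range(p))

p = 3
sums = [character_sum(beta, p) for beta in range(p)]

# The trivial character sums to p; every other character sums to 0 (up to rounding).
for beta, total in enumerate(sums):
    print(beta, abs(total))
```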


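To state the next theorem we must choose any one of the characters $\chi_\beta$ with $\beta \neq 0$. For concreteness we choose $\beta = 1$, i.e., the character $\chi_1$ defined by

\[ \chi_1(\gamma) = \xi^{\gamma_0}, \quad \text{for } \gamma = (\gamma_0, \ldots, \gamma_{m-1}) \in GF(q). \tag{41} \]

If $q$ is a prime $p$, this is simply $\chi_1(\gamma) = \xi^\gamma$, $\gamma \in GF(p)$, where $\xi = e^{2\pi i/p}$.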


MacWilliams theorem for complete weight enumerators. 


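Theorem 10. If $\mathscr{C}$ is a linear $[n, k]$ code over $GF(q)$ with complete weight enumerator $W_{\mathscr{C}}$, the complete weight enumerator of the dual code $\mathscr{C}^\perp$ is

\[ W_{\mathscr{C}^\perp}(z_0, \ldots, z_r, \ldots, z_{q-1}) = \frac{1}{|\mathscr{C}|} W_{\mathscr{C}}\Bigl(\sum_{s=0}^{q-1} \chi_1(\omega_0 \omega_s) z_s, \ldots, \sum_{s=0}^{q-1} \chi_1(\omega_r \omega_s) z_s, \ldots, \sum_{s=0}^{q-1} \chi_1(\omega_{q-1} \omega_s) z_s\Bigr). \tag{42} \]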


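Example. For a code over $GF(3)$ the theorem states that

\[ W_{\mathscr{C}^\perp}(z_0, z_1, z_2) = \frac{1}{|\mathscr{C}|} W_{\mathscr{C}}(z_0 + z_1 + z_2,\; z_0 + \omega z_1 + \omega^2 z_2,\; z_0 + \omega^2 z_1 + \omega z_2) \tag{43} \]

where $\omega = e^{2\pi i/3}$. I.e., $W_{\mathscr{C}^\perp}$ is obtained by applying the linear transformation

\[ \begin{pmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ 1 & \omega^2 & \omega \end{pmatrix} \tag{44} \]

to $W_{\mathscr{C}}$, and dividing the result by $|\mathscr{C}|$. E.g., for the code $\mathscr{C}_4$ given above,

\[ W_{\mathscr{C}_4^\perp}(z_0, z_1, z_2) = \tfrac{1}{9}(z_0 + z_1 + z_2)\bigl[(z_0 + z_1 + z_2)^3 + (2z_0 - z_1 - z_2)^3\bigr]. \]


Problem. (27) Show that $\mathscr{C}_4$ is self-dual. Check this by showing $W_{\mathscr{C}_4^\perp} = W_{\mathscr{C}_4}$.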
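The ternary case (43) of Theorem 10 can be checked numerically on a $[4, 2, 3]$ code. In the sketch below the generator matrix with rows 1110 and 0121 is an assumption made for illustration (code #6 of Ch. 1 may be presented with a different but equivalent generator matrix), and the function names are hypothetical.

```python
import cmath
from itertools import product

Q = 3
OMEGA = cmath.exp(2j * cmath.pi / 3)

def span(generators):
    """All GF(3)-linear combinations of the generator rows, sorted."""
    n = len(generators[0])
    words = set()
    for coeffs in product(range(Q), repeat=len(generators)):
        words.add(tuple(sum(c * g[r] for c, g in zip(coeffs, generators)) % Q
                        for r in range(n)))
    return sorted(words)

def dual(code, n):
    """Brute-force dual code: all vectors orthogonal to every codeword."""
    return [u for u in product(range(Q), repeat=n)
            if all(sum(a * b for a, b in zip(u, v)) % Q == 0 for v in code)]

def cwe(code, z):
    """Complete weight enumerator (37) evaluated at the point z = (z0, z1, z2)."""
    total = 0
    for u in code:
        term = 1
        for symbol in u:
            term *= z[symbol]
        total += term
    return total

def transformed(z):
    """The substitution (43)."""
    z0, z1, z2 = z
    return (z0 + z1 + z2,
            z0 + OMEGA * z1 + OMEGA ** 2 * z2,
            z0 + OMEGA ** 2 * z1 + OMEGA * z2)

code = span([(1, 1, 1, 0), (0, 1, 2, 1)])    # assumed generator matrix
point = (0.7, -1.3, 2.1)                     # arbitrary evaluation point

lhs = cwe(dual(code, 4), point)
rhs = cwe(code, transformed(point)) / len(code)
print(abs(lhs - rhs) < 1e-9)
```

Evaluating both sides at an arbitrary numeric point avoids symbolic algebra; agreement at one generic point is evidence for, not a proof of, the polynomial identity.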


The theorem depends on the following lemma. 


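Lemma 11. For $u, v \in F^n$ let $\chi_u(v) = \chi_1(u \cdot v)$. As in Lemma 2, the Hadamard transform $\hat{f}$ of a mapping $f$ defined on $F^n$ is given by

\[ \hat{f}(u) = \sum_{v \in F^n} \chi_u(v) f(v). \]

Then if $\mathscr{C}$ is any $[n, k]$ code over $GF(q)$,

\[ \sum_{u \in \mathscr{C}^\perp} f(u) = \frac{1}{|\mathscr{C}|} \sum_{u \in \mathscr{C}} \hat{f}(u). \tag{45} \]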


The proof is essentially the same as that of Lemma 2, but requires the use 
of Lemma 9. 


Proof of Theorem 10. We apply Lemma 11 with

\[ f(u) = z_0^{s_0(u)} z_1^{s_1(u)} \cdots z_{q-1}^{s_{q-1}(u)}. \]

Then

\[ \hat{f}(u) = \sum_{v \in F^n} \chi_1(u \cdot v)\, z_0^{s_0(v)} \cdots z_{q-1}^{s_{q-1}(v)} = \prod_{r=1}^{n} \Bigl(\sum_{s=0}^{q-1} \chi_1(u_r \omega_s) z_s\Bigr), \]

in the same way that (11) is obtained from (9). Then Equation (45) gives the theorem. Q.E.D.


Lee weight enumerator. By setting certain variables equal to each other in the 
complete weight enumerator we obtain the Lee and Hamming weight 
enumerators, which give progressively less and less information about the 
code, but become easier to handle. 


Definition. Suppose now that $q = 2\delta + 1$ is an odd prime power, and let the elements of $GF(q)$ be labeled $\omega_0 = 0, \omega_1, \ldots, \omega_\delta, \omega_{\delta+1}, \ldots, \omega_{q-1}$, where $\omega_{q-i} = -\omega_i$ for $1 \leq i \leq \delta$. E.g., we take

\[ GF(5) = \{\omega_0 = 0,\; \omega_1 = 1,\; \omega_2 = 2,\; \omega_3 = -2 = 3,\; \omega_4 = -1 = 4\}. \]

The Lee composition of a vector $u \in F^n$, denoted by Lee$(u)$, is $(l_0, l_1, \ldots, l_\delta)$ where $l_0 = s_0(u)$, $l_i = s_i(u) + s_{q-i}(u)$ for $1 \leq i \leq \delta$.

E.g., for $q = 5$, the Lee composition classifies codewords according to the number of components which are 0, the number which are $\pm 1$, and the number which are $\pm 2$.


The Lee enumerator of a code $\mathscr{C}$ is

\[ \mathscr{L}_{\mathscr{C}}(z_0, \ldots, z_\delta) = \sum_{u \in \mathscr{C}} z_0^{l_0} z_1^{l_1} \cdots z_\delta^{l_\delta}. \]

For example, the self-dual code over $GF(5)$ of length 2 consisting of the codewords

    0 0,   1 2,   2 -1,   -2 1,   -1 -2,

has Lee enumerator $z_0^2 + 4 z_1 z_2$.


MacWilliams theorem for Lee enumerators. 


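Theorem 12. The Lee enumerator for the dual code $\mathscr{C}^\perp$ is obtained from the Lee enumerator of $\mathscr{C}$ by replacing each $z_i$ by

\[ z_0 + \sum_{s=1}^{\delta} \bigl(\chi_1(\omega_i \omega_s) + \chi_1(-\omega_i \omega_s)\bigr) z_s, \]

and dividing the result by $|\mathscr{C}|$.
Proof. Set $z_{q-i} = z_i$ for $1 \leq i \leq \delta$ in Theorem 10. Q.E.D.


For the code in the preceding example, we have $\chi_1(\omega_i \omega_j) = \alpha^{ij}$ where $\alpha = e^{2\pi i/5} = \cos 72^\circ + i \sin 72^\circ$, and the transformation of the theorem replaces

\[ \begin{pmatrix} z_0 \\ z_1 \\ z_2 \end{pmatrix} \quad \text{by} \quad \begin{pmatrix} 1 & 2 & 2 \\ 1 & \alpha + \alpha^4 & \alpha^2 + \alpha^3 \\ 1 & \alpha^2 + \alpha^3 & \alpha + \alpha^4 \end{pmatrix} \begin{pmatrix} z_0 \\ z_1 \\ z_2 \end{pmatrix}. \tag{46} \]

Since the code is self-dual the theorem asserts correctly that

\[ z_0^2 + 4 z_1 z_2 = \tfrac{1}{5}\bigl[(z_0 + 2z_1 + 2z_2)^2 + 4\bigl(z_0 + (\alpha + \alpha^4) z_1 + (\alpha^2 + \alpha^3) z_2\bigr)\bigl(z_0 + (\alpha^2 + \alpha^3) z_1 + (\alpha + \alpha^4) z_2\bigr)\bigr]. \]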


Problem. (28) Verify this identity. 


Hamming weight enumerator. Now let $q$ be any prime power. As in §3 of Ch. 1 the Hamming weight, or simply the weight, of a vector $u = (u_1, \ldots, u_n) \in F^n$ is the number of nonzero components $u_r$, and is denoted by wt$(u)$. We shall use the notation of §2 and let $A_i$ be the number of codewords of weight $i$, and

\[ W_{\mathscr{C}}(x, y) = \sum_{i=0}^{n} A_i x^{n-i} y^i = \sum_{u \in \mathscr{C}} x^{n - \mathrm{wt}(u)} y^{\mathrm{wt}(u)} \]

be the Hamming weight enumerator of a code $\mathscr{C}$.


MacWilliams theorem for Hamming weight enumerators. 


Theorem 13.
\[ W_{\mathscr{C}^\perp}(x, y) = \frac{1}{|\mathscr{C}|} W_{\mathscr{C}}\bigl(x + (q - 1)y,\; x - y\bigr). \tag{47} \]
Proof. In Theorem 10 put $z_0 = x$, $z_1 = z_2 = \cdots = z_{q-1} = y$, and use Lemma 9. Q.E.D.


Example. For the code of the preceding example, $W = x^2 + 4y^2$, and the theorem asserts correctly that

\[ x^2 + 4y^2 = \tfrac{1}{5}\bigl[(x + 4y)^2 + 4(x - y)^2\bigr]. \]

When $q = 2$, Theorem 13 reduces to Theorem 1.
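Theorem 13 can be verified by brute force for the length-2 code over $GF(5)$ used above. This is a sketch under the assumption that the code is generated by the single word (1, 2); the dual is found by exhaustive search and the function names are illustrative.

```python
from itertools import product

def weight_distribution(code, n):
    """A_i = number of codewords with exactly i nonzero components."""
    A = [0] * (n + 1)
    for u in code:
        A[sum(1 for s in u if s != 0)] += 1
    return A

def dual(code, n, p):
    """Brute-force dual over the prime field GF(p)."""
    return [u for u in product(range(p), repeat=n)
            if all(sum(a * b for a, b in zip(u, v)) % p == 0 for v in code)]

def enumerator(A, x, y):
    """Evaluate sum_i A_i x^(n-i) y^i."""
    n = len(A) - 1
    return sum(a * x ** (n - i) * y ** i for i, a in enumerate(A))

p, n = 5, 2
code = [tuple((c * g) % p for g in (1, 2)) for c in range(p)]   # multiples of (1, 2)

A = weight_distribution(code, n)
A_dual = weight_distribution(dual(code, n, p), n)

x, y = 3, 2                                                     # arbitrary integer point
lhs = len(code) * enumerator(A_dual, x, y)                      # |C| * W_dual(x, y)
rhs = enumerator(A, x + (p - 1) * y, x - y)                     # W_C(x + (q-1)y, x - y)
print(lhs == rhs)
```

Multiplying (47) through by $|\mathscr{C}|$ keeps the comparison in exact integer arithmetic.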
A weight enumerator which completely specifies the code. By introducing enough variables it is possible to specify the code completely. An example will make this clear. Suppose $F = GF(5)$. We describe the vector $u = (2, 0, 4) \in F^3$ by the polynomial $z_{12} z_{20} z_{34}$. In general, the variable $z_{ij}$ means that the $i$-th place in the vector $u$ is the $j$-th element $\omega_j$ of $F$. The vector $u = (\omega_{a_1}, \omega_{a_2}, \ldots, \omega_{a_n})$ is described by the polynomial

\[ f(u) = z_{1 a_1} z_{2 a_2} \cdots z_{n a_n}. \tag{48} \]

Thus $u$ is uniquely determined by $f(u)$. This requires the use of $nq$ variables $z_{ij}$ ($1 \leq i \leq n$, $0 \leq j \leq q - 1$). (This is similar to what we did in §3 for binary codes.)
What we shall call the exact enumerator of a code $\mathscr{C}$ is then defined as

\[ \mathscr{E}_{\mathscr{C}} = \sum_{u \in \mathscr{C}} f(u). \]


MacWilliams theorem for exact enumerators. 


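Theorem 14. $\mathscr{E}_{\mathscr{C}^\perp}$ is obtained from $\mathscr{E}_{\mathscr{C}}$ by replacing each $z_{ir}$ by

\[ \sum_{s=0}^{q-1} \chi_1(\omega_r \omega_s) z_{is} \tag{49} \]

and dividing the result by $|\mathscr{C}|$.
Proof. Use Lemma 11 with $f(u)$ defined by (48). Q.E.D.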


Theorem 14 is a very general version of the MacWilliams theorem and of 
course all the earlier theorems follow from it. In the remainder of this section 
we give a few more corollaries which give information about the contents of 
certain coordinate positions of the codewords. 


Joint weight enumerators. The joint weight enumerator of two codes $\mathscr{A}$ and $\mathscr{B}$ measures the overlap between the zeros in a typical codeword of $\mathscr{A}$ and a typical codeword of $\mathscr{B}$. This generalizes the Hamming weight enumerator just as a joint probability-density function generalizes a single density function. For simplicity we only consider the binary case, with $F = GF(2)$.
For $u = (u_1, \ldots, u_n)$, $v = (v_1, \ldots, v_n) \in F^n$ let

    i(u, v) = number of r such that u_r = 0, v_r = 0,
    j(u, v) = number of r such that u_r = 0, v_r = 1,
    k(u, v) = number of r such that u_r = 1, v_r = 0,
    l(u, v) = number of r such that u_r = 1, v_r = 1.

Of course

    i(u, v) + j(u, v) + k(u, v) + l(u, v) = n,
    j(u, v) + l(u, v) = wt(v),
    k(u, v) + l(u, v) = wt(u).

The joint weight enumerator of $\mathscr{A}$ and $\mathscr{B}$ is

\[ \mathscr{J}_{\mathscr{A},\mathscr{B}}(a, b, c, d) = \sum_{u \in \mathscr{A}} \sum_{v \in \mathscr{B}} a^{i(u,v)} b^{j(u,v)} c^{k(u,v)} d^{l(u,v)}. \tag{50} \]

The joint weight enumerator of a code $\mathscr{A}$ with itself is called the biweight enumerator of $\mathscr{A}$.


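Problems. Establish the following properties of the joint weight enumerator.

(29) $\mathscr{J}_{\mathscr{A},\mathscr{B}}(1, 1, 1, 1) = |\mathscr{A}|\,|\mathscr{B}|$,
\[ \mathscr{J}_{\mathscr{A},\mathscr{B}}(a, b, c, d) = \mathscr{J}_{\mathscr{B},\mathscr{A}}(a, c, b, d). \]

The single weight enumerators are given by

\[ W_{\mathscr{A}}(x, y) = \frac{1}{|\mathscr{B}|} \mathscr{J}_{\mathscr{A},\mathscr{B}}(x, x, y, y), \qquad W_{\mathscr{B}}(x, y) = \frac{1}{|\mathscr{A}|} \mathscr{J}_{\mathscr{A},\mathscr{B}}(x, y, x, y), \]

\[ W_{\mathscr{A}}(x, y) = \mathscr{J}_{\mathscr{A},\mathscr{B}}(x, 0, y, 0), \text{ provided } 0 \in \mathscr{B}, \qquad W_{\mathscr{B}}(x, y) = \mathscr{J}_{\mathscr{A},\mathscr{B}}(x, y, 0, 0), \text{ provided } 0 \in \mathscr{A}. \]

Also

\[ W_{\mathscr{A}}(x, y) W_{\mathscr{B}}(z, t) = \mathscr{J}_{\mathscr{A},\mathscr{B}}(xz, xt, yz, yt). \]

(30) If $\mathscr{A} = \mathscr{B} = \{0, 1\}$ = repetition code of length $n$,

\[ \mathscr{J}_{\mathscr{A},\mathscr{B}}(a, b, c, d) = a^n + b^n + c^n + d^n. \]

If $\mathscr{A} = \{0\}$, $\mathscr{B}$ arbitrary,

\[ \mathscr{J}_{\mathscr{A},\mathscr{B}}(a, b, c, d) = W_{\mathscr{B}}(a, b). \]

If $\mathscr{A}$ arbitrary, $\mathscr{B} = F^n$ = all vectors of length $n$,

\[ \mathscr{J}_{\mathscr{A},\mathscr{B}}(a, b, c, d) = W_{\mathscr{A}}(a + b, c + d). \]

If $\mathscr{A}$ arbitrary, $\mathscr{B}$ = {all even weight vectors},

\[ \mathscr{J}_{\mathscr{A},\mathscr{B}}(a, b, c, d) = \tfrac{1}{2} W_{\mathscr{A}}(a + b, c + d) + \tfrac{1}{2} W_{\mathscr{A}}(a - b, c - d). \]

(31)
\[ \mathscr{J}_{\mathscr{A}^\perp,\mathscr{B}}(a, b, c, d) = \frac{1}{|\mathscr{A}|} \mathscr{J}_{\mathscr{A},\mathscr{B}}(a + c, b + d, a - c, b - d), \]
\[ \mathscr{J}_{\mathscr{A},\mathscr{B}^\perp}(a, b, c, d) = \frac{1}{|\mathscr{B}|} \mathscr{J}_{\mathscr{A},\mathscr{B}}(a + b, a - b, c + d, c - d). \]

(32)
\[ \mathscr{J}_{\mathscr{A}^\perp,\mathscr{B}^\perp}(a, b, c, d) = \frac{1}{|\mathscr{A}|\,|\mathscr{B}|} \mathscr{J}_{\mathscr{A},\mathscr{B}}(a + b + c + d,\; a - b + c - d,\; a + b - c - d,\; a - b - c + d). \tag{51} \]

(33) For the $[n = 2^m - 1, m, 2^{m-1}]$ simplex code = dual of Hamming code, for $m \geq 2$,

\[ \mathscr{J}_{\mathscr{A},\mathscr{A}}(a, b, c, d) = a^n + n a^{(n-1)/2}\bigl(b^{(n+1)/2} + c^{(n+1)/2} + d^{(n+1)/2}\bigr) + \frac{n(n-1)}{a}(abcd)^{(n+1)/4}. \]

(34) For the $[n = 2^m - 1, 2^m - m - 1, 3]$ Hamming code, for $m \geq 2$,

\[ \mathscr{J}_{\mathscr{A},\mathscr{A}}(a, b, c, d) = \frac{1}{(n+1)^2}\Bigl[\sigma_1^n + n \sigma_1^{(n-1)/2}\bigl\{(a - b + c - d)^{(n+1)/2} + (a + b - c - d)^{(n+1)/2} + (a - b - c + d)^{(n+1)/2}\bigr\} + \frac{n(n-1)}{\sigma_1}\bigl(\sigma_4 - 2\sigma_{2,2} + 8\sigma_{1,1,1,1}\bigr)^{(n+1)/4}\Bigr], \]

where $\sigma$ denotes a symmetric function of $a, b, c, d$:

\[ \sigma_i = a^i + b^i + c^i + d^i, \]
\[ \sigma_{i,j} = a^i(b^j + c^j + d^j) + b^i(a^j + c^j + d^j) + \cdots, \quad i \neq j, \]
\[ \sigma_{i,i} = a^i b^i + a^i c^i + a^i d^i + b^i c^i + b^i d^i + c^i d^i, \]
\[ \sigma_{i,i,i,i} = a^i b^i c^i d^i. \]

(35) For the $[n = 2^m, m + 1, 2^{m-1}]$ first-order Reed-Muller code, for $m \geq 2$,

\[ \mathscr{J}_{\mathscr{A},\mathscr{A}}(a, b, c, d) = \sigma_n + 2(n - 1)\sigma_{n/2,n/2} + 4(n - 1)(n - 2)\sigma_{n/4,n/4,n/4,n/4}. \]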


(36) To emphasize that the biweight enumerator gives more information 
about a code than does the weight enumerator, show that the codes generated 
by {110000, 001100, 000011} and {110000, 011000, 001111} have the same 
weight enumerator but different biweight enumerators. Another such pair 
consists of the [32, 16,8] second-order Reed-Muller code and the quadratic 
residue code with the same parameters (see Ch. 19). 
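The first pair of codes in Problem 36 is small enough to compare by enumeration. The sketch below stores a joint weight enumerator as a table from exponent 4-tuples $(i, j, k, l)$ to coefficients, following (50); the function names are illustrative.

```python
from collections import Counter
from itertools import product

def span(generators):
    """All binary linear combinations of the generator rows, sorted."""
    n = len(generators[0])
    words = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        words.add(tuple(sum(c * g[r] for c, g in zip(coeffs, generators)) % 2
                        for r in range(n)))
    return sorted(words)

def joint_enumerator(code_a, code_b):
    """Map each exponent tuple (i, j, k, l) of a^i b^j c^k d^l to its coefficient."""
    terms = Counter()
    for u in code_a:
        for v in code_b:
            pairs = Counter(zip(u, v))
            terms[(pairs[(0, 0)], pairs[(0, 1)], pairs[(1, 0)], pairs[(1, 1)])] += 1
    return terms

def weight_distribution(code):
    """Map each weight to the number of codewords of that weight."""
    return Counter(sum(u) for u in code)

def bits(s):
    """Convert a string such as '110000' to a tuple of integers."""
    return tuple(int(ch) for ch in s)

code1 = span([bits("110000"), bits("001100"), bits("000011")])
code2 = span([bits("110000"), bits("011000"), bits("001111")])

same_weights = weight_distribution(code1) == weight_distribution(code2)
same_biweights = joint_enumerator(code1, code1) == joint_enumerator(code2, code2)
print(same_weights, same_biweights)
```

One way to see the difference by hand: in the first code every two codewords overlap in an even number of positions, whereas 110000 and 011000 in the second code overlap in exactly one position.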


Split weight enumerator. In many codes the vectors are divided naturally into a left half and a right half. Examples are codes formed from the $|u|u + v|$ construction (§9 of Ch. 2), Reed-Muller codes and codes obtained from them (see Chs. 13 and 15). For such codes it is useful to keep track of the weights of the two halves separately.

The left and right weight of a vector $v = (v_1, \ldots, v_m, v_{m+1}, \ldots, v_{2m})$ are respectively $w_L = \mathrm{wt}(v_1, \ldots, v_m)$, $w_R = \mathrm{wt}(v_{m+1}, \ldots, v_{2m})$. The split weight enumerator of a $[2m, k]$ code $\mathscr{C}$ is

\[ \mathscr{S}_{\mathscr{C}}(x, y, X, Y) = \sum_{v \in \mathscr{C}} x^{m - w_L(v)} y^{w_L(v)} X^{m - w_R(v)} Y^{w_R(v)}. \]
v€'6 


Problems. (37) Let $\mathscr{C}$ and $\mathscr{D}$ be codes of length $n$ with weight enumerators $W_1(x, y)$, $W_2(x, y)$ and split weight enumerators $\mathscr{S}_1(x, y, X, Y)$, $\mathscr{S}_2(x, y, X, Y)$. Show that the direct sum

\[ \mathscr{C} \oplus \mathscr{D} = \{|u|v| : u \in \mathscr{C},\; v \in \mathscr{D}\} \]

has weight enumerator $W_1(x, y)W_2(x, y)$ and split weight enumerator $W_1(x, y)W_2(X, Y)$. On the other hand the code

\[ \mathscr{C} \,\|\, \mathscr{D} = \{|u'|v'|u''|v''| : u = |u'|u''| \in \mathscr{C},\; v = |v'|v''| \in \mathscr{D}\}, \]

(where $u$ and $v$ have each been broken into two equal halves) has the same weight enumerator but its split weight enumerator is $\mathscr{S}_1(x, y, X, Y)\mathscr{S}_2(x, y, X, Y)$. Notice that $\mathscr{C} \,\|\, \mathscr{D}$ can be obtained from $\mathscr{C} \oplus \mathscr{D}$ by permuting the coordinates of the codewords. As an example take $\mathscr{C} = \mathscr{D} = \{00, 11\}$, with

\[ W(x, y) = x^2 + y^2, \qquad \mathscr{S}(x, y, X, Y) = xX + yY. \]

Then the split weight enumerators of

\[ \mathscr{C} \oplus \mathscr{D} = \{0000, 0011, 1100, 1111\} \]

and

\[ \mathscr{C} \,\|\, \mathscr{D} = \{0000, 0101, 1010, 1111\} \]

are respectively $W(x, y)W(X, Y) = (x^2 + y^2)(X^2 + Y^2)$ and $\mathscr{S}(x, y, X, Y)^2 = (xX + yY)^2$.
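The two length-4 codes of this example can be compared directly. The sketch below records a split weight enumerator as a table from pairs $(w_L, w_R)$ to the number of codewords with those half-weights; this representation and the names used are illustrative choices.

```python
from collections import Counter

def split_enumerator(code):
    """Map (w_L, w_R) to the number of codewords with those left/right weights."""
    m = len(code[0]) // 2
    return Counter((sum(v[:m]), sum(v[m:])) for v in code)

C = [(0, 0), (1, 1)]

# Direct sum |u|v|: u fills the left half and v the right half.
direct_sum = [u + v for u in C for v in C]

# Interleaved code |u'|v'|u''|v''|: each half holds one symbol of u and one of v.
interleaved = [(u[0], v[0], u[1], v[1]) for u in C for v in C]

print(sorted(direct_sum))
print(sorted(interleaved))
print(split_enumerator(direct_sum))
print(split_enumerator(interleaved))
```

The first table has one codeword at each of (0,0), (0,2), (2,0), (2,2), matching $(x^2 + y^2)(X^2 + Y^2)$; the second has counts 1, 2, 1 at (0,0), (1,1), (2,2), matching $(xX + yY)^2$.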
(38) Prove

\[ \mathscr{S}_{\mathscr{C}^\perp}(x, y, X, Y) = \frac{1}{|\mathscr{C}|} \mathscr{S}_{\mathscr{C}}(x + y, x - y, X + Y, X - Y). \tag{52} \]

(39) Give the split weight enumerator of a first-order Reed-Muller code.
(40) Let $\mathscr{C}$ be a linear code of length $2m$, and suppose the first $m$ symbols of each codeword are sent over a binary symmetric channel with error probability $p$, and the last $m$ symbols over a channel with error probability $P$. Let

\[ d_\alpha = \min\bigl(\alpha_1 w_L(u) + \alpha_2 w_R(u)\bigr), \]

where the minimum is taken over all nonzero $u$ in $\mathscr{C}$, and $\alpha_1 = \log\frac{1 - p}{p}$, $\alpha_2 = \log\frac{1 - P}{P}$. Show that $\mathscr{C}$ can correct all error vectors $e$ satisfying

\[ \alpha_1 w_L(e) + \alpha_2 w_R(e) < \tfrac{1}{2} d_\alpha. \]

(If the split weight enumerator of $\mathscr{C}$ is known, $d_\alpha$ can be easily obtained.)


§7. Properties of Krawtchouk polynomials 


The results of §3-§5 dealing with weight enumerators of nonlinear codes can also be generalized to $GF(q)$. To do so requires a slightly more general version of the Krawtchouk polynomials which we give in this section.

Definition. For any prime power $q$ and positive integer $n$, define the Krawtchouk polynomial

\[ P_k(x; n) = P_k(x) = \sum_{j=0}^{k} (-1)^j \gamma^{k-j} \binom{x}{j} \binom{n - x}{k - j}, \qquad k = 0, 1, \ldots, n, \tag{53} \]

where $\gamma = q - 1$, and the binomial coefficients are defined in Problem 18 of Ch. 1. These polynomials have the generating function

\[ (1 + \gamma z)^{n-x} (1 - z)^x = \sum_{k=0}^{\infty} P_k(x) z^k. \tag{54} \]

If $x$ is an integer with $0 \leq x \leq n$ the upper limit of summation can be replaced by $n$.
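Definition (53) and the generating function (54) can be compared coefficient by coefficient at integer arguments. In this sketch polynomials in $z$ are plain coefficient lists; the function names and the test values $n = 6$, $q = 3$ are arbitrary choices for illustration.

```python
from math import comb

def krawtchouk(k, x, n, q=2):
    """P_k(x; n) from (53), for an integer 0 <= x <= n."""
    g = q - 1
    return sum((-1) ** j * g ** (k - j) * comb(x, j) * comb(n - x, k - j)
               for j in range(k + 1))

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def generating_coefficients(x, n, q=2):
    """Coefficients of (1 + gamma*z)^(n-x) * (1 - z)^x in increasing powers of z."""
    g = q - 1
    coeffs = [1]
    for _ in range(n - x):
        coeffs = poly_mul(coeffs, [1, g])
    for _ in range(x):
        coeffs = poly_mul(coeffs, [1, -1])
    return coeffs

n, q = 6, 3
matches = all(
    generating_coefficients(x, n, q) == [krawtchouk(k, x, n, q) for k in range(n + 1)]
    for x in range(n + 1)
)
print(matches)
```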


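Theorem 15. (Alternative expressions.)

\[ \text{(i)} \quad P_k(x) = \sum_{j=0}^{k} (-q)^j \gamma^{k-j} \binom{n - j}{k - j} \binom{x}{j}, \tag{55} \]
\[ \text{(ii)} \quad P_k(x) = \sum_{j=0}^{k} (-1)^j q^{k-j} \binom{n - k + j}{j} \binom{n - x}{k - j}. \tag{56} \]

Proof. (i)

\[ (1 + \gamma z)^{n-x} (1 - z)^x = (1 + \gamma z)^n \Bigl(1 - \frac{qz}{1 + \gamma z}\Bigr)^x = \sum_{j \geq 0} \binom{x}{j} (-qz)^j (1 + \gamma z)^{n-j}. \]

The coefficient of $z^k$ in this is

\[ P_k(x) = \sum_{j=0}^{k} (-q)^j \gamma^{k-j} \binom{n - j}{k - j} \binom{x}{j}. \]

(ii) The proof is similar, starting from

\[ (1 + \gamma z)^{n-x} (1 - z)^x = (1 - z)^n \Bigl(1 + \frac{qz}{1 - z}\Bigr)^{n-x}. \quad \text{Q.E.D.} \]

Thus $P_k(x)$ is a polynomial of degree $k$ in $x$, with leading coefficient $(-q)^k/k!$ and constant term

\[ P_k(0) = \binom{n}{k} \gamma^k. \tag{57} \]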


Theorem 16. (Orthogonality relations.) For nonnegative integers $r, s$,

\[ \sum_{i=0}^{n} \binom{n}{i} \gamma^i P_r(i) P_s(i) = q^n \gamma^r \binom{n}{r} \delta_{r,s}, \tag{58} \]

where $\delta_{r,s} = 1$ if $r = s$, $\delta_{r,s} = 0$ if $r \neq s$ is the Kronecker symbol.

Proof. The LHS is the coefficient of $y^r z^s$ in

\[ \sum_{i=0}^{n} \binom{n}{i} \gamma^i (1 + \gamma y)^{n-i} (1 - y)^i (1 + \gamma z)^{n-i} (1 - z)^i = \bigl[(1 + \gamma y)(1 + \gamma z) + \gamma(1 - y)(1 - z)\bigr]^n = q^n (1 + \gamma yz)^n. \quad \text{Q.E.D.} \]


Theorem 17. For nonnegative integers $i, s$,

\[ \gamma^i \binom{n}{i} P_s(i) = \gamma^s \binom{n}{s} P_i(s). \tag{59} \]

Proof. This follows at once from (53) by rearranging the binomial coefficients.


Corollary 18.

\[ \sum_{i=0}^{n} P_r(i) P_i(s) = q^n \delta_{r,s}. \]

Proof. This is immediate from Theorems 16 and 17. Q.E.D.


Theorem 19. (Recurrence.) The Krawtchouk polynomials satisfy a three-term recurrence:

\[ (k + 1) P_{k+1}(x) = \bigl[(n - k)\gamma + k - qx\bigr] P_k(x) - \gamma(n - k + 1) P_{k-1}(x), \tag{60} \]

for $k = 1, 2, \ldots$, with initial values $P_0(x) = 1$, $P_1(x) = \gamma n - qx$.


Proof. Differentiate (54) with respect to $z$, multiply by $(1 + \gamma z)(1 - z)$, and equate coefficients of $z^k$. Q.E.D.
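The orthogonality relation (58), the symmetry (59) and the recurrence (60) are all integer identities, so they can be confirmed exactly for small $n$ and $q$. The sketch below does this by exhaustive evaluation at integer points; the function names and the sampled parameter pairs are illustrative.

```python
from math import comb

def krawtchouk(k, x, n, q=2):
    """P_k(x; n) from (53), for an integer 0 <= x <= n."""
    g = q - 1
    return sum((-1) ** j * g ** (k - j) * comb(x, j) * comb(n - x, k - j)
               for j in range(k + 1))

def orthogonality_holds(n, q):
    """Check (58) for all 0 <= r, s <= n."""
    g = q - 1
    for r in range(n + 1):
        for s in range(n + 1):
            lhs = sum(comb(n, i) * g ** i * krawtchouk(r, i, n, q) * krawtchouk(s, i, n, q)
                      for i in range(n + 1))
            rhs = q ** n * g ** r * comb(n, r) if r == s else 0
            if lhs != rhs:
                return False
    return True

def symmetry_holds(n, q):
    """Check (59) for all 0 <= i, s <= n."""
    g = q - 1
    return all(g ** i * comb(n, i) * krawtchouk(s, i, n, q)
               == g ** s * comb(n, s) * krawtchouk(i, s, n, q)
               for i in range(n + 1) for s in range(n + 1))

def recurrence_holds(n, q):
    """Check (60) at every integer x in 0..n and every k in 1..n-1."""
    g = q - 1
    for x in range(n + 1):
        for k in range(1, n):
            lhs = (k + 1) * krawtchouk(k + 1, x, n, q)
            rhs = (((n - k) * g + k - q * x) * krawtchouk(k, x, n, q)
                   - g * (n - k + 1) * krawtchouk(k - 1, x, n, q))
            if lhs != rhs:
                return False
    return True

checks = [(orthogonality_holds(n, q), symmetry_holds(n, q), recurrence_holds(n, q))
          for n, q in [(5, 2), (4, 3)]]
print(checks)
```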


Theorem 20. If the Krawtchouk expansion of a polynomial $\alpha(x)$ of degree $t$ is

\[ \alpha(x) = \sum_{k=0}^{t} \alpha_k P_k(x), \tag{61} \]

then the coefficients are given by

\[ \alpha_k = q^{-n} \sum_{i=0}^{n} \alpha(i) P_i(k). \]

Proof. Multiply (61) by $P_i(l)$, set $x = i$, sum on $i$, and use Corollary 18. Q.E.D.

Problems. (41) Show
AC )ne-0;)) 
(42) Show 


Pé(x;n)-P(x;n)t- c P.(x;n)- PL(x—- E; n- l0). 


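In the remaining problems $q = 2$.
(43) Show that $P_k(i) = (-1)^i P_{n-k}(i)$, $0 \leq i \leq n$.
(44) Show $P_n(k) = (-1)^k$, $P_k(1) = (1 - (2k/n))\binom{n}{k}$.
(45) Show that

\[ \sum_{i=0}^{n} P_i(k) = 2^n \delta_{k,0}, \]

\[ \sum_{i=0}^{n} i P_i(k) = 2^{n-1}(n \delta_{k,0} - \delta_{k,1}), \]

\[ \sum_{i=0}^{n} i^2 P_i(k) = 2^{n-2}\{n(n + 1)\delta_{k,0} - 2n \delta_{k,1} + 2\delta_{k,2}\}. \]

(46) For nonnegative integers $i$ and $k$ show that

\[ (n - k) P_i(k + 1) = (n - 2i) P_i(k) - k P_i(k - 1). \]

[Hint: Theorems 17 and 19.]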

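(47) Because the Krawtchouk polynomials are orthogonal (Theorem 16), many of the results of Szegő's book [1297] apply to them. For example, prove the Christoffel-Darboux formula (cf. Theorem 3.2.2 of [1297]):

\[ \frac{P_{t+1}(x) P_t(y) - P_t(x) P_{t+1}(y)}{y - x} = \frac{2}{t + 1} \binom{n}{t} \sum_{i=0}^{t} \frac{P_i(x) P_i(y)}{\binom{n}{i}}. \]

[Hint: Use Theorem 19 to show that

\[ \frac{(t + 1)\{P_{t+1}(x) P_t(y) - P_t(x) P_{t+1}(y)\}}{\binom{n}{t}} = \frac{t\{P_t(x) P_{t-1}(y) - P_{t-1}(x) P_t(y)\}}{\binom{n}{t-1}} - \frac{2(x - y) P_t(x) P_t(y)}{\binom{n}{t}}, \]

then sum on $t$.]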


Notes on Chapter 5 


§2. Theorems 1, 10, and 13 are due to MacWilliams [871, 872]. Lemma 2 is from Van Lint [848, p. 120]. Lemmas 2 and 11 are versions of the Poisson summation formula; see Dym and McKean [396, p. 220]. Theorem 1 has also been proved by the methods of combinatorial geometry; see Greene [560].

Krawtchouk polynomials were defined by Krawtchouk [782]; see also Dunkl [393], Dunkl and Ramirez [394], Eagleson [398], Krawtchouk [783], Vere-Jones [1370], and especially Szegő [1297]. They were first explicitly used in coding theory by Delsarte [350-352]. We shall meet these polynomials again in Ch. 21 as the eigenvalues of a certain association scheme.

The power moment identities were given by Pless [1051]. See also Stanley [1261] and Zierler [1464]. Problem 8 is also from Pless [1051]. For more about Stirling numbers see Riordan [1113, p. 33]. Berlekamp [113, §16.2] gives a more general family of moments.

Let the burst-length of a vector be the distance between the first 1 and the last 1. Korzhik [777] considers the distribution of codewords according to their burst length. Problem 4 was suggested by our colleague John I. Smith.


§§3, 4. The group algebra (also called the group ring) and characters can be found in many books on algebra, e.g. Mann [907, p. 73]. They can be generalized to codes over GF(q), along with the results of §5.


§5. The nonlinear MacWilliams theorem can be found in MacWilliams et al. [886] and in Zierler [1468]. McEliece [942] has given a version of this theorem in which the underlying alphabet need not be a field.

Theorems 6, 7, and 8 are due to Delsarte [350]. Orthogonal arrays were defined by Bose and Bush [181]; see also Bush [220], Hall [587] and Raghavarao [1085]. Problem 20 is due to Assmus and Mattson [47].


§6. For the Lee enumerator and codes for the Lee metric see Lee [801], Berlekamp [115], Golomb et al. [524, 525, 529, 531, 532], Mazur [934] and the Notes to Ch. 6. Theorem 12 is from MacWilliams et al. [883]. The remaining weight enumerators in this section were defined in [883] and by Mallows and Sloane [895]. Katayama [747] has given a version of the MacWilliams theorem which applies to r-tuples of codewords (generalizing (50) and (51), which are the case r = 2), and has applied it to the Hamming code and its dual.


§7. Szegő [1297] and Erdelyi et al. [410, Ch. 10] are good references for orthogonal polynomials.





Codes, designs and perfect codes


§1. Introduction


This chapter continues the study of set-theoretic properties of codes begun 
in Chapter 5. 

We begin in §2 by defining four fundamental parameters $d$, $d'$, $s$, $s'$ associated with any linear or nonlinear code $\mathscr{C}$. $d$ is the minimum distance between codewords of $\mathscr{C}$, and $s$ is the number of different nonzero distances between codewords. For a linear code $d'$ and $s'$ are the analogous quantities for the dual code $\mathscr{C}^\perp$. For a nonlinear code $d'$, still called the dual distance, was defined in §5 of Ch. 5; and $s'$ is the number of subscripts $i \neq 0$ such that $B'_i \neq 0$, where $B'_i$ is as usual the transform of the distance distribution. $s'$ is called the external distance of the code, because any vector is at distance $\leq s'$ from at least one codeword of $\mathscr{C}$. (Theorem 21 of §6. However, $s'$ need not be the smallest integer with this property: see Problem 11.)

A number of interesting codes are such that either $s \leq d'$ or $s' \leq d$. When this happens the minimum distance (or the dual distance) is at least as large as the number of unknowns in the distance distribution, and the MacWilliams identities can be solved. From this it follows that such codes have three remarkable properties, which we establish in §§3, 4, 7:


(a) The number of codewords at distance $i$ from a codeword $v$ is independent of the choice of $v$. Speaking loosely, the view of the code from any codeword is the same as the view from any other codeword (Theorem 3 and Corollary 5 of §3). Such a code is called distance invariant. If the code contains the codeword 0, then the weight and distance distributions coincide. (Of course property (a) always holds for linear codes.)

(b) If $0, \tau_1, \ldots, \tau_s$ are the weights occurring in the code, then the number of codewords of weight $\tau_i$ is an explicit function of $n$, $M$, and the numbers $\tau_j$ (Theorems 2 and 4 of §3 and Theorem 7 of §4).

(c) The codewords of each weight form a $t$-design, where $t$ is at least $d' - s$ or $d - s'$. (Actually we prove a slightly stronger result than this. See Theorem 9 of §4, Corollary 14 of §5, and Theorem 24 of §7.)

In §5 we show that if the codewords of each weight in any binary linear code $\mathscr{C}$ form $t$-designs then so do the codewords of each weight in $\mathscr{C}^\perp$ (Theorem 13).

§6 studies the weight distribution of the translates of any linear or nonlinear code.

§8 establishes two properties of perfect codes: the fact that a code is perfect iff $s' = e$ (Theorem 27), and Lloyd's necessary condition for a code to be perfect (Theorem 28).

Up to this point in the chapter all the codes considered have been binary. Then in §9 we discuss how the results may be generalized to codes over $GF(q)$. We only consider constructing designs from linear codes over $GF(q)$ (Theorem 29). When restricted to binary codes Theorem 29 gives a result which is apparently stronger than Corollary 14 (see Corollary 31).

Finally in §10 we prove the Tietäväinen-Van Lint theorem (Theorem 33) that the only nontrivial perfect codes over any field are those with the parameters $n$, $M$, and $d$ of the Hamming or Golay codes.


§2. Four fundamental parameters of a code


Let $\mathscr{C}$ be an $(n, M, d)$ binary code, not necessarily linear, which contains 0. Suppose $\{B_i\}$ is the distance distribution of $\mathscr{C}$, i.e., $B_i$ is the number of ordered pairs of codewords at a distance $i$ apart, divided by $M$. Thus $B_0 = 1$. (See §1 of Ch. 2.)
Let $0, \tau_1, \tau_2, \ldots, \tau_s$ be the subscripts $i$ for which $B_i \neq 0$, where

\[ 0 < \tau_1 < \tau_2 < \cdots < \tau_s \leq n. \]

Then $d = \tau_1$ is the minimum distance between codewords of $\mathscr{C}$, and $s$ is the number of distinct (nonzero) distances between codewords.
Let $\{B'_i\}$ be the MacWilliams transform of $\{B_i\}$, given by (see §5 of Ch. 5)

\[ 1 + \sum_{i=1}^{s} B_{\tau_i} y^{\tau_i} = M 2^{-n} \sum_{i=0}^{n} B'_i (1 + y)^{n-i} (1 - y)^i, \tag{1} \]

or equivalently by Equations (35) or (36) of Ch. 5. Suppose $0, \sigma_1, \ldots, \sigma_{s'}$ are the subscripts $i$ for which $B'_i \neq 0$, where

\[ 0 < \sigma_1 < \cdots < \sigma_{s'} \leq n. \]

We call $s'$ the external distance of $\mathscr{C}$, and $d' = \sigma_1$ is the dual distance. We saw in Theorem 8 of Ch. 5 that $d'$ is the largest number such that each $(d' - 1)$-subset of the coordinates of $\mathscr{C}$ contains all $(d' - 1)$-tuples an equal number of times.

The weight distribution of $\mathscr{C}$ is $\{A_i\}$, where $A_i$ is the number of codewords of weight $i$, and $\{A'_i\}$ is the MacWilliams transform of $\{A_i\}$.

Lemma 1. The number of nonzero $A_i$ is at most $s$, the number of nonzero $A'_i$ is at most $s'$, and

\[ A_i = 0 \quad \text{for } 0 < i < d, \]
\[ A'_i = 0 \quad \text{for } 0 < i < d'. \]


Proof. The statements about $A_i$ follow from the definitions of $d$ and $s$. The statements about $A'_i$ follow from Theorem 7 of Ch. 5. Q.E.D.


Suppose now we change the origin, i.e., replace $\mathscr{C}$ by $\mathscr{C}^* = \mathscr{C} + v$, where $v$ is a codeword. The new code $\mathscr{C}^*$ is still an $(n, M, d)$ code with the distance distribution $\{B_i\}$ and a possibly different weight distribution $\{A^*_i\}$ say. However Lemma 1 is still true for $\{A^*_i\}$.

Note: If $\mathscr{C}$ is linear, then $A_i = B_i$ = weight distribution of $\mathscr{C}$, $A'_i = B'_i$ = weight distribution of $\mathscr{C}^\perp$, and $d'$ is the minimum distance of $\mathscr{C}^\perp$.

Our first goal in this chapter is to study the interesting properties of codes for which either $s \leq d'$ or $s' \leq d$. Some examples of such codes are:


(E1) The $[n, n - 1, 2]$ even weight code, whose dual is the $[n, 1, n]$ repetition code. Here $d = 2$, $s = [\tfrac{1}{2}n]$, $d' = n$, $s' = 1$.

(E2) The $[n = 2^m - 1, m, 2^{m-1}]$ simplex code, whose dual is the $[2^m - 1, 2^m - 1 - m, 3]$ Hamming code. Here $d = 2^{m-1}$, $s = 1$, $d' = 3$, $s' = n - 4$.


Problem. (1) More generally, for the $(n, n + 1, \tfrac{1}{2}(n + 1))$ Hadamard code $\mathscr{A}_{n+1}$ of §3 of Ch. 2 (where $n \equiv 3 \pmod 4$), show that $d = \tfrac{1}{2}(n + 1)$, $s = 1$, $d' = 3$, $s' = n - 4$.


(E3) The $[n = 2^m, m + 1, 2^{m-1}]$ first-order Reed-Muller code (§9 of Ch. 1), whose dual is the $[2^m, 2^m - m - 1, 4]$ extended Hamming code. Here $d = 2^{m-1}$, $s = 2$, $d' = 4$, $s' = \tfrac{1}{2}n - 2$.


Problem. (2) Show that the Hadamard code $\mathscr{C}_n$ also has these parameters.


(E4) The $[24, 12, 8]$ extended Golay code (§6 of Ch. 2), which is self-dual, has $d = d' = 8$, $s = s' = 4$.

(E5) The $(16, 256, 6)$ Nordstrom-Robinson code $\mathscr{N}_{16}$ (§8 of Ch. 2), has $d = d' = 6$, $s = s' = 4$ (see the end of §5 of Ch. 5).

Other examples (which include some quadratic residue codes, Reed-Muller, Kerdock and Preparata codes) will be found in later chapters.


Problem. (3) Show that for the code (00000, 00011, 00101, 11011}, A; = B; for all 
i, yet the code is not distance invariant. 


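§3. An explicit formula for the weight and distance distribution


Theorem 2. If $s \le d'$, then an explicit formula for the distance distribution $\{B_i\}$
in terms of $n$, $M$, and the $\tau_i$'s is

$$B_{\tau_i} = -\prod_{\substack{j=1 \\ j \neq i}}^{s} \frac{\tau_j}{\tau_j - \tau_i} + \frac{1}{N} \sum_{t=0}^{n} \binom{n}{t} \prod_{\substack{j=1 \\ j \neq i}}^{s} \frac{\tau_j - t}{\tau_j - \tau_i}, \qquad 1 \le i \le s,$$

where $N = 2^n/M$. (N.B. An empty product is equal to 1, by convention.)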
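Theorem 2 is easy to evaluate mechanically. The sketch below is illustrative and not from the original text: it evaluates $B_{\tau_i} = -\prod_{j \ne i} \tau_j/(\tau_j - \tau_i) + N^{-1}\sum_t \binom{n}{t} \prod_{j \ne i} (\tau_j - t)/(\tau_j - \tau_i)$ in exact rational arithmetic, and the function name `distance_distribution_from_distances` is hypothetical. The caller is responsible for checking the hypothesis $s \le d'$.

```python
from fractions import Fraction
from math import comb


def distance_distribution_from_distances(n, M, taus):
    """Evaluate the formula of Theorem 2 for each nonzero distance tau_i.

    n    -- code length
    M    -- number of codewords
    taus -- the distinct nonzero distances tau_1 < ... < tau_s (assumes s <= d')
    Returns a dict {tau_i: B_{tau_i}}.
    """
    N = Fraction(2 ** n, M)
    result = {}
    for ti in taus:
        others = [tj for tj in taus if tj != ti]
        # First term: product of tau_j / (tau_j - tau_i) over j != i.
        first = Fraction(1)
        for tj in others:
            first *= Fraction(tj, tj - ti)
        # Second term: sum over t of C(n, t) * prod (tau_j - t) / (tau_j - tau_i).
        second = Fraction(0)
        for t in range(n + 1):
            term = Fraction(comb(n, t))
            for tj in others:
                term *= Fraction(tj - t, tj - ti)
            second += term
        result[ti] = -first + second / N
    return result


# Extended Golay code (E4): n = 24, M = 2^12, distances 8, 12, 16, 24, d' = 8.
print(distance_distribution_from_distances(24, 2 ** 12, [8, 12, 16, 24]))
```

For the extended Golay code this returns 759, 2576, 759, 1 for the weights 8, 12, 16, 24, the familiar weight distribution of that code.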


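Proof. As in §2 of Ch. 5, we differentiate (1) $j$ times, divide by $j!$, and set
$y = 1$. This gives $s$ equations

$$\sum_{i=1}^{s} B_{\tau_i} = \frac{2^n}{N} - 1,$$
$$\sum_{i=1}^{s} \binom{\tau_i}{j} B_{\tau_i} = \frac{2^{n-j}}{N} \binom{n}{j}, \qquad 1 \le j \le s-1. \tag{2}$$

To solve these we set $B_{\tau_i} = a_{\tau_i} + b_{\tau_i}$, where

$$\sum_{i=1}^{s} a_{\tau_i} \binom{\tau_i}{j} = \frac{2^{n-j}}{N} \binom{n}{j}, \qquad 0 \le j \le s-1,$$
$$\sum_{i=1}^{s} b_{\tau_i} \binom{\tau_i}{j} = -\delta_{0,j}, \qquad 0 \le j \le s-1. \tag{3}$$

The matrix of coefficients for both systems is

$$T = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ \tau_1 & \tau_2 & \cdots & \tau_s \\ \binom{\tau_1}{2} & \binom{\tau_2}{2} & \cdots & \binom{\tau_s}{2} \\ \vdots & \vdots & & \vdots \\ \binom{\tau_1}{s-1} & \binom{\tau_2}{s-1} & \cdots & \binom{\tau_s}{s-1} \end{pmatrix}.$$

The inverse of $T$ is obtained as follows. Let

$$f_i(x) = \prod_{\substack{j=1 \\ j \neq i}}^{s} \frac{\tau_j - x}{\tau_j - \tau_i}.$$

Expand $f_i(x)$ in binomial coefficients, viz.

$$f_i(x) = \sum_{j=0}^{s-1} f_{ij} \binom{x}{j},$$

where

$$\binom{x}{j} = x(x-1) \cdots (x-j+1)/j!.$$

Clearly

$$\sum_{j=0}^{s-1} f_{ij} \binom{\tau_k}{j} = f_i(\tau_k) = \delta_{ik},$$

and so

$$T^{-1} = [f_{ij}].$$

Now multiply the column vectors on the right hand side of Equation (3) by
$T^{-1}$. We obtain

$$a_{\tau_i} = \frac{1}{N} \sum_{j=0}^{s-1} f_{ij}\, 2^{n-j} \binom{n}{j}, \qquad b_{\tau_i} = -f_{i0} = -f_i(0).$$

Now

$$2^{n-j} \binom{n}{j} = \sum_{t=0}^{n} \binom{n}{t} \binom{t}{j}.$$

Thus

$$a_{\tau_i} = \frac{1}{N} \sum_{t=0}^{n} \binom{n}{t} \sum_{j=0}^{s-1} f_{ij} \binom{t}{j} = \frac{1}{N} \sum_{t=0}^{n} \binom{n}{t} f_i(t),$$

and $B_{\tau_i} = a_{\tau_i} + b_{\tau_i}$ is the formula of the theorem.
Q.E.D.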


Theorem 3. If $s \le d'$, $A_i = B_i$ for all $i$.


Proof. $A_i \neq 0$ implies $B_i \neq 0$, so the only possible nonzero $A_i$'s are $A_0, A_{\tau_1}, \ldots,$
$A_{\tau_s}$. From Lemma 1, $A'_i = 0$ for $1 \le i \le d' - 1$. Thus the $A_i$'s satisfy (2), and the
proof of Theorem 2 applies. Q.E.D.


Problem. (4) Given that the only distances occurring in the Nordstrom-Robinson
code $\mathcal{N}_{16}$ are 0, 6, 8, 10, 16, and that $d' = 6$, use Theorem 2 to obtain
the distance distribution.

Theorem 4. Suppose $s' \le d$. Then $A'_i = B'_i$ for all $i$, and

$$A'_{\sigma_i} = B'_{\sigma_i} = -\prod_{\substack{j=1 \\ j \neq i}}^{s'} \frac{\sigma_j}{\sigma_j - \sigma_i} + \frac{1}{M} \sum_{t=0}^{n} \binom{n}{t} \prod_{\substack{j=1 \\ j \neq i}}^{s'} \frac{\sigma_j - t}{\sigma_j - \sigma_i}, \qquad 1 \le i \le s'.$$

Proof. Exactly as for Theorems 2 and 3. Q.E.D.

Corollary 5. If $s' \le d$, $A_i = B_i$ for all $i$.

Proof. $\{A_i\}$, $\{B_i\}$ are the MacWilliams transforms of $\{A'_i\}$, $\{B'_i\}$. Q.E.D.


Theorem 6. If $s \le d'$ or $s' \le d$ the code is distance invariant.


Proof. Theorem 3 and Corollary 5 hold for any translate of $\mathcal{C}$ by a codeword.
Q.E.D.


Remark. Although the conditions $s \le d'$ or $s' \le d$ are sufficient for $A_i = B_i$,
they are not necessary, as shown by the code of Fig. 5.1 or by Problem 3 of
this chapter.
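Distance invariance, the conclusion of Theorem 6, means that the weight distribution seen from every codeword is the same. A small brute-force check makes the distinction between "$A_i = B_i$" and "distance invariant" concrete. This is a sketch only, with hypothetical function names, suitable for small codes given as strings or tuples of symbols.

```python
from collections import Counter


def weight_distribution_from(code, origin):
    """Number of codewords at each distance 0..n from the codeword `origin`."""
    n = len(origin)
    counts = Counter(sum(a != b for a, b in zip(origin, c)) for c in code)
    return [counts.get(i, 0) for i in range(n + 1)]


def is_distance_invariant(code):
    """True if every codeword sees the same distribution of distances."""
    reference = weight_distribution_from(code, code[0])
    return all(weight_distribution_from(code, v) == reference for v in code)


# The code of Problem 3 and a small linear code (which is always distance invariant).
print(is_distance_invariant(["00000", "00011", "00101", "11011"]))
print(is_distance_invariant(["000", "011", "101", "110"]))
```

Any linear code passes this test, since translating by a codeword maps the code onto itself.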


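§4. Designs from codes when $\bar{s} < d'$


In this section we show that if $\bar{s} < d'$ (where $\bar{s}$ is defined below) then the
codewords of each weight form a $t$-design (see the definition in §5 of Ch. 2).
To begin with we restate slightly the formula for $A_i$.

From Theorem 3, if $s \le d'$, then $A_n = B_n = 0$ or 1. We now assume that $A_n$
is known (it usually is), so that the unknown $A_i$'s are $A_{\tau_1}, \ldots, A_{\tau_{\bar{s}}}$, where

$$\bar{s} = \begin{cases} s & \text{if } A_n = 0, \\ s - 1 & \text{if } A_n = 1. \end{cases}$$

Theorem 7. For $1 \le i \le \bar{s}$,

$$A_{\tau_i} = -\prod_{\substack{j=1 \\ j \neq i}}^{\bar{s}} \frac{\tau_j}{\tau_j - \tau_i} + \frac{1}{N} \sum_{t=0}^{n} \binom{n}{t} \prod_{\substack{j=1 \\ j \neq i}}^{\bar{s}} \frac{\tau_j - t}{\tau_j - \tau_i} - A_n \prod_{\substack{j=1 \\ j \neq i}}^{\bar{s}} \frac{\tau_j - n}{\tau_j - \tau_i}. \tag{4}$$

Proof. As for Theorem 2, except that Equation (2) is now written as

$$\sum_{i=1}^{\bar{s}} \binom{\tau_i}{j} A_{\tau_i} = \left( \frac{2^{n-j}}{N} - A_n \right) \binom{n}{j} - \delta_{0,j}, \qquad 0 \le j \le \bar{s} - 1. \tag{5}$$

Q.E.D.

If $A_n = 1$, then $\tau_{\bar{s}+1-i} = n - \tau_i$. Thus the 1st and 3rd terms in (4) can be
combined into

$$-\left( 1 + (-1)^{\bar{s}-1} \frac{\tau_i}{n - \tau_i} \right) \prod_{\substack{j=1 \\ j \neq i}}^{\bar{s}} \frac{\tau_j}{\tau_j - \tau_i}.$$

Let $S(x) = \prod_{j=1}^{\bar{s}} (\tau_j - x)$. Then (4) can be written:

$$A_{\tau_i} \prod_{\substack{j=1 \\ j \neq i}}^{\bar{s}} (\tau_j - \tau_i) = -\frac{S(0)}{\tau_i} - A_n \frac{S(n)}{\tau_i - n} + \frac{1}{N} \sum_{t=0}^{n} \binom{n}{t} \frac{S(t)}{\tau_i - t}, \tag{6}$$

where $S(t)/(\tau_i - t)$ stands for the polynomial $\prod_{j \neq i} (\tau_j - t)$, also when $t = \tau_i$.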
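The codewords of weight $\tau_i$ form a $t$-design. Let $u$ be a fixed vector of $F^n$ of
weight $t$, where $0 < t < d'$. For $\tau_i \ge t$, let $A_{\tau_i}(u)$ be the number of codewords of
$\mathcal{C}$ of weight $\tau_i$ which cover $u$.


Theorem 8. The numbers $A_{\tau_i}(u)$ satisfy the $d' - t$ equations

$$\sum_{\tau_i \ge t+j} \binom{\tau_i - t}{j} A_{\tau_i}(u) = \frac{M}{2^{t+j}} \binom{n-t}{j} = \frac{2^{n-t-j}}{N} \binom{n-t}{j}, \qquad 0 \le j \le d' - 1 - t. \tag{7}$$


Proof. Use Theorem 8 of Ch. 5 to count in two ways the vectors of weight $t + j$
which cover $u$ and are covered by a codeword of $\mathcal{C}$. Q.E.D.


Since the vector 1 covers everything, we may write (7) as

$$\sum_{i=1}^{\bar{s}} \binom{\tau_i - t}{j} A_{\tau_i}(u) = \left( \frac{2^{n-t-j}}{N} - A_n \right) \binom{n-t}{j}. \tag{8}$$

If we can choose $t$ so that $d' - t = \bar{s}$, we have $\bar{s}$ linearly independent equations
for the $\bar{s}$ unknowns $A_{\tau_i}(u)$. Then the $A_{\tau_i}(u)$ do not depend on the particular choice
of $u$. Thus the codewords of each weight $\tau_i$ form a $t$-design, where $t = d' - \bar{s}$.
The parameters of this design are obtained as follows.
Equations (8) are of the same form as (5), and the solution is:

$$\lambda_{\tau_i}\, g_{i,t}(\tau_i - t) = \frac{1}{N} \sum_{r=0}^{n-t} \binom{n-t}{r} g_{i,t}(r) - A_n\, g_{i,t}(n - t),$$

where $\lambda_{\tau_i} = A_{\tau_i}(u)$ and

$$g_{i,t}(x) = \prod_{\substack{j=1 \\ j \neq i}}^{\bar{s}} (\tau_j - t - x).$$

Clearly $g_{i,t}(x - t) = g_{i,0}(x)$, so the solution is

$$\lambda_{\tau_i}\, g_{i,0}(\tau_i) = \frac{1}{N} \sum_{r=0}^{n-t} \binom{n-t}{r} g_{i,0}(r + t) - A_n\, g_{i,0}(n),$$

or

$$\lambda_{\tau_i} \prod_{\substack{j=1 \\ j \neq i}}^{\bar{s}} (\tau_j - \tau_i) = \frac{1}{N} \sum_{r=0}^{n-t} \binom{n-t}{r} \frac{S(r+t)}{\tau_i - r - t} - A_n \frac{S(n)}{\tau_i - n}, \tag{9}$$

where $S(x) = \prod_{j=1}^{\bar{s}} (\tau_j - x)$ and $S(x)/(\tau_i - x)$ stands for the polynomial $\prod_{j \neq i}(\tau_j - x)$.

Thus we have proved:


Theorem 9. If $\bar{s} < d'$, then the codewords of weight $\tau_i$ in $\mathcal{C}$ form a
$(d' - \bar{s})$-$(n, \tau_i, \lambda_{\tau_i})$ design, with $\lambda_{\tau_i}$ given by (9), provided that $\tau_i \ge d' - \bar{s}$.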


Examples. (cont.) (E1) Supposing $n$ to be even, the even weight code contains
the all-ones vector, so $\bar{s} = s - 1 = \tfrac{1}{2}n - 1$. From the theorem, the codewords
of any weight $2r \ge \tfrac{1}{2}n + 1$ form a $(\tfrac{1}{2}n + 1)$-design. However, since this code
contains all vectors of weight $2r$, this is in fact a $2r$-design. Thus the
conclusion of the theorem is not always the strongest possible result.

(E2) From the theorem, the codewords of weight $\tfrac{1}{2}(n+1)$ in the Hadamard
code $\mathcal{A}_{n+1}$ form a 2-design (in agreement both with Problem 11 of Ch. 2 and
with example (i) at the end of §5 of Ch. 5).

(E3) Similarly we get 3-designs from the codewords of the first-order
Reed-Muller code (and the Hadamard code $\mathcal{C}_n$).

If the extended Hamming code is considered to be the main code, the
primes are interchanged, and $d' - \bar{s} = 2^{m-1} - (\tfrac{1}{2}n - 3) = 3$. Thus the codewords
of each weight in the extended Hamming code form a 3-design, agreeing with
Theorem 15 of Ch. 2.

(E4) Similarly the codewords of the extended Golay code form 5-designs
(cf. Corollaries 23, 25 and Theorem 26 of Ch. 2).

(E5) Similarly the codewords of the Nordstrom-Robinson code give
3-(16, 6, 4), 3-(16, 8, 3), and 3-(16, 10, 24) designs.


Problem. (5) Check this.
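For short codes, design claims of this kind can be confirmed by exhaustive counting. The sketch below is an illustration under stated assumptions, not the book's method: it builds the [8, 4, 4] extended Hamming code (the first-order Reed-Muller code of length 8) from an assumed generator matrix and counts, for every $t$-subset of coordinates, the codewords of a given weight that are 1 on it. The helper names are hypothetical.

```python
from itertools import combinations, product


def span(generators):
    """All GF(2) linear combinations of the generator rows."""
    n = len(generators[0])
    return [tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % 2
                  for i in range(n))
            for coeffs in product((0, 1), repeat=len(generators))]


def design_lambda(blocks, n, t):
    """Return lambda if every t-subset of coordinates is covered by the same
    number of blocks (0/1 vectors of length n), otherwise None."""
    counts = {sum(1 for b in blocks if all(b[i] for i in T))
              for T in combinations(range(n), t)}
    return counts.pop() if len(counts) == 1 else None


# Assumed generator matrix of the first-order Reed-Muller code of length 8.
rm_1_3 = span([(1, 1, 1, 1, 1, 1, 1, 1),
               (0, 0, 0, 0, 1, 1, 1, 1),
               (0, 0, 1, 1, 0, 0, 1, 1),
               (0, 1, 0, 1, 0, 1, 0, 1)])
weight_4 = [c for c in rm_1_3 if sum(c) == 4]

print(len(weight_4), design_lambda(weight_4, 8, 3), design_lambda(weight_4, 8, 4))
```

The 14 codewords of weight 4 cover every triple of coordinates exactly once, so they form a 3-(8, 4, 1) design, but they do not form a 4-design.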


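Identities satisfied by the weights. Since the codewords of weight $\tau_i$ form a
$(d' - \bar{s})$-design, they certainly form a $p$-design for all $p \le d' - \bar{s}$. Suppose
this is a $p$-$(n, \tau_i, \lambda^{(p)}_{\tau_i})$ design.
From (9) for $p = 1$ we obtain

$$\lambda^{(1)}_{\tau_i} \prod_{\substack{j=1 \\ j \neq i}}^{\bar{s}} (\tau_j - \tau_i) = \frac{1}{N} \sum_{r=0}^{n-1} \binom{n-1}{r} \frac{S(r+1)}{\tau_i - r - 1} - A_n \frac{S(n)}{\tau_i - n}.$$

Since

$$\tau_i A_{\tau_i} = n \lambda^{(1)}_{\tau_i}$$

and $A_{\tau_i}$ is given by (6), we obtain the following identity for the weights of the
code (multiply (6) by $\tau_i$, multiply the last equation by $n$, subtract, and use
$n \binom{n-1}{r-1} = r \binom{n}{r}$):

$$-S(0) - A_n S(n) + \frac{1}{N} \sum_{r=0}^{n} \binom{n}{r} S(r) = 0. \tag{10}$$

Also, if $A_n = 1$,

$$S(n) = (-1)^{\bar{s}} \prod_{j=1}^{\bar{s}} (n - \tau_j) = (-1)^{\bar{s}} S(0).$$

We now consider two cases. If $A_n = 0$, then (10) becomes

$$S(0) = \frac{1}{N} \sum_{r=0}^{n} \binom{n}{r} S(r). \tag{11}$$

Similarly, if $A_n = 1$, (10) becomes

$$S(0) \left( 1 + (-1)^{\bar{s}} \right) = \frac{1}{N} \sum_{r=0}^{n} \binom{n}{r} S(r). \tag{12}$$


Problem. (6) Prove (12) directly from the expressions for $A_{\tau_i}$ given by
Theorem 2 and Equation (6). Show that (12) is trivial for $\bar{s}$ odd.


More generally, since the codewords form a $p$-design for $2 \le p \le d' - \bar{s}$, we
have

$$(n - p + 1) \lambda^{(p)}_{\tau_i} = (\tau_i - p + 1) \lambda^{(p-1)}_{\tau_i}. \tag{13}$$

Using the expressions for $\lambda^{(p)}_{\tau_i}$ and $\lambda^{(p-1)}_{\tau_i}$ from (9), we obtain

$$A_n S(n) = \frac{1}{N} \sum_{r=0}^{n-p+1} \binom{n-p+1}{r} S(r + p - 1). \tag{14}$$

Thus we have proved:


Theorem 10. Suppose $\bar{s} < d'$. Then the weights of the codewords satisfy the
following numerical conditions (where

$$S(x) = \prod_{j=1}^{\bar{s}} (\tau_j - x)):$$

(i) If $A_n = 0$, then

$$S(0) = \frac{1}{N} \sum_{r=0}^{n} \binom{n}{r} S(r),$$

and

$$\sum_{r=0}^{n-p+1} \binom{n-p+1}{r} S(r + p - 1) = 0$$

for $2 \le p \le d' - \bar{s}$.

(ii) If $A_n = 1$, then

$$S(0) \left( 1 + (-1)^{\bar{s}} \right) = \frac{1}{N} \sum_{r=0}^{n} \binom{n}{r} S(r),$$

and

$$(-1)^{\bar{s}} S(0) = \frac{1}{N} \sum_{r=0}^{n-p+1} \binom{n-p+1}{r} S(r + p - 1)$$

for $2 \le p \le d' - \bar{s}$.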
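The numerical conditions of this kind are simple to test for a proposed set of weights. The following sketch is illustrative only. It assumes the identities in the form $S(0) + A_n S(n) = N^{-1} \sum_r \binom{n}{r} S(r)$ and $A_n S(n) = N^{-1} \sum_r \binom{n-p+1}{r} S(r+p-1)$ for $2 \le p \le d' - \bar{s}$, with $S(x) = \prod_j (\tau_j - x)$ taken over the $\bar{s}$ weights other than 0 and $n$, and $N = 2^n/M$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def S(x, taus_bar):
    """S(x) = product of (tau_j - x) over the weights tau_j other than 0 and n."""
    value = 1
    for tau in taus_bar:
        value *= tau - x
    return value


def weights_satisfy_identities(n, M, taus_bar, A_n, d_dual):
    """Check the weight identities for A_n in {0, 1} and dual distance d_dual."""
    N = Fraction(2 ** n, M)
    s_bar = len(taus_bar)
    # Identity for p = 1.
    lhs = S(0, taus_bar) + A_n * S(n, taus_bar)
    rhs = sum(comb(n, r) * S(r, taus_bar) for r in range(n + 1)) / N
    if lhs != rhs:
        return False
    # Identities for 2 <= p <= d' - s_bar.
    for p in range(2, d_dual - s_bar + 1):
        m = n - p + 1
        rhs_p = sum(comb(m, r) * S(r + p - 1, taus_bar) for r in range(m + 1)) / N
        if A_n * S(n, taus_bar) != rhs_p:
            return False
    return True


# Extended Golay code: weights 8, 12, 16 (besides 0 and 24), A_24 = 1, d' = 8.
print(weights_satisfy_identities(24, 2 ** 12, [8, 12, 16], 1, 8))
# A perturbed weight set fails the test.
print(weights_satisfy_identities(24, 2 ** 12, [8, 12, 15], 1, 8))
```

Such a check is only a necessary condition: passing it does not show that a code with the given weights exists.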


Problem. (7) If $A_n = 0$, $d' = 3$ and $s = \bar{s} = 1$ (all nonzero codewords have the same
weight $\tau$) show that the codewords can form at most a 2-design; and if they do
then $\tau = \tfrac{1}{2}(n + 1)$, $M = n + 1$, and $M$ is divisible by 4.


§5. The dual code also gives designs


In this section we let $\mathcal{C}$ be any $[n, k, d]$ binary linear code with the property
that for some $t < d$ the codewords of each weight $w > 0$ form a $t$-design.

Let $T$ be a set of $t$ coordinate places. The code obtained by deleting these
places from $\mathcal{C}$ will be denoted by $\mathcal{C}^T$. Thus $\mathcal{C}^T$ is an $[n - t, k, d - t]$ code. Let
$\{A^T_i\}$ be the weight distribution of $\mathcal{C}^T$.


Lemma 11. $A^T_i$ is independent of the choice of $T$.


Proof. Let $\lambda^{(v)}_i$ be the number of codewords of $\mathcal{C}$ of weight $v$ which contain
exactly $i$ coordinates of $T$. By §5 of Ch. 2, $\lambda^{(v)}_i$ does not depend on the choice
of $T$. Then neither does

$$A^T_i = \lambda^{(i)}_0 + \lambda^{(i+1)}_1 + \cdots + \lambda^{(i+t)}_t. \qquad \text{Q.E.D.}$$


Let $\mathcal{C}^{\circ T}$ be the shortened code obtained by taking those codewords of $\mathcal{C}$
which are zero on $T$ and deleting the coordinates of $T$. Thus $\mathcal{C}^{\circ T}$ is an
$[n - t, k - t, d]$ code. Also $(\mathcal{C}^\perp)^{\circ T}$ is the dual of $\mathcal{C}^T$ (cf. Figs. 1.11, 1.13).


Corollary 12. The weight distribution of $(\mathcal{C}^\perp)^{\circ T}$ is independent of the choice of
$T$.


Proof. By Theorem 1 of Ch. 5. Q.E.D.

Theorem 13. Let $\mathcal{C}$ be an $[n, k, d]$ binary linear code with $k > 1$, such that for
each weight $w > 0$ the codewords of weight $w$ form a $t$-design, where $t < d$.
Then the codewords of each weight in $\mathcal{C}^\perp$ also form a $t$-design.


Remark. If $k = 1$, in order to have a $t$-design $\mathcal{C}$ must be the repetition code.
Then $\mathcal{C}^\perp$ consists of all even weight vectors and gives trivial designs; see
example (E1) above.


Proof. Since $k > 1$, $\mathcal{C}$ contains codewords of some weight $v$ in the range
$0 < v < n$, and by hypothesis these codewords form a $t$-design.

Pick $w$ such that $\mathcal{C}^\perp$ contains a codeword $b$ (say) of weight $w$. If $w = n$
there is nothing to prove, so we may suppose $w < n$.

The first step is to show that $w < n - t$. (a) Suppose $w = n - t$. If $v - t$ is
odd, pick a codeword $a$ in $\mathcal{C}$ of weight $v$, which is 1 where $b$ is 0.

    <---- w ---->  <-- t -->
    b = 111 ... 111   000000
    a = 000 ... 111   111111
              <->
              v-t

Then $a \cdot b = v - t \equiv 1 \pmod 2$, which is impossible. On the other hand if $v - t$
is even, pick $a$ to be 1 on all but one of the zeros of $b$.

    <---- w ---->  <-- t -->
    b = 111 ... 111   000000
    a = 000 ... 111   111110
              <->
             v-t+1

That this can be done follows from the fact that, in the notation of §5 of Ch.
2, $\lambda_{t-1} > \lambda_t$. Now $a \cdot b = v - t + 1 \equiv 1 \pmod 2$, again a contradiction. This
proves $w \neq n - t$.

(b) Suppose $w = n - t + i = n - (t - i)$, where $i > 0$. But this is impossible
by (a), since a $t$-design is automatically a $(t - i)$-design. This proves that
$w < n - t$.

We shall now show that if $w < n - t$, then the set of codewords of weight $w$
in $\mathcal{C}^\perp$ forms a $t$-design. Let $c_1, \ldots, c_s$ be the complements of these codewords,
where $s = A'_w$. Let $T$ be any set of $t$ coordinates.

The number of $c_i$'s which are 1 on $T$ is exactly the number of codewords
of weight $w$ in $(\mathcal{C}^\perp)^{\circ T}$. Thus by Corollary 12 this number is independent of
the choice of $T$, and so the $c_i$'s form a $t$-design. Hence the codewords of
weight $w$ form the complementary design (§5 of Ch. 2). Q.E.D.

Example. (E2) (cont.) From Theorem 9 the codewords in the simplex code
form a 2-design. From Theorem 13, the codewords of each weight in the
Hamming code also form a 2-design.


Combining Theorems 9 and 13 we have:


Corollary 14. Let $\mathcal{C}$ be a linear code with parameters $d$, $s$, $d'$, $s'$. Let $\bar{s}$ be as
above, and

$$\bar{s}' = \begin{cases} s' & \text{if } A'_n = 0, \\ s' - 1 & \text{if } A'_n = 1. \end{cases}$$

If either $\bar{s} < d'$ or $\bar{s}' < d$, then the codewords of weight $w$ in $\mathcal{C}$ form a $t$-design,
where

$$t = \max \{ d' - \bar{s},\; d - \bar{s}' \},$$

provided that $t < d$.


Research Problem (6.1). In all nontrivial examples known to us, $d' - \bar{s}$ is less
than $d$ and the final proviso of the theorem is unnecessary. Is this always so?


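§6. Weight distribution of translates of a code


In this section we study the weight distribution of translates of any
$(n, M, d)$ code $\mathcal{C}$. If $f$ is any vector of $F^n$, let $A_i(f)$ be the number of vectors
of weight $i$ in the translate $\mathcal{C} + f$. Of course if $\mathcal{C}$ is linear, $\mathcal{C} + f$ is the coset of
$\mathcal{C}$ containing $f$. The MacWilliams transform of the $\{A_i(f)\}$ is, from (31) of Ch. 5,

$$A'_i(f) = \frac{1}{M} \sum_{\mathrm{wt}(u) = i} (-1)^{u \cdot f} \chi_u(C),$$

where

$$C = \sum_{v \in \mathcal{C}} z^v$$

represents the code in the group algebra notation of Ch. 5.
When $f = 0$, $A_i(0) = A_i$ is the weight distribution of $\mathcal{C}$, and $A'_i(0) = A'_i$.
Note that (cf. Problems 18, 19 of Ch. 5)

$$A'_0(f) = 1,$$
$$\sum_{i=0}^{n} A'_i(f) = \frac{2^n}{M} A_0(f),$$

which is zero if $f \notin \mathcal{C}$.

Theorem 15. The $A'_i(f)$ are orthogonal:

$$\sum_{f \in F^n} A'_i(f) A'_j(f) = \begin{cases} 0 & \text{if } i \neq j, \\ 2^n B'_i & \text{if } i = j. \end{cases}$$


Proof.

$$M^2 \sum_{f \in F^n} A'_i(f) A'_j(f) = \sum_{\mathrm{wt}(u) = i} \chi_u(C) \sum_{\mathrm{wt}(v) = j} \chi_v(C) \sum_{f \in F^n} (-1)^{f \cdot (u + v)}.$$

Now

$$\sum_{f \in F^n} (-1)^{f \cdot (u + v)} = \begin{cases} 0 & \text{if } u + v \neq 0, \\ 2^n & \text{if } u + v = 0. \end{cases}$$

If $i \neq j$, then $u + v \neq 0$, which proves the first part of the theorem. If $i = j$, by
(36) of Ch. 5,

$$M^2 \sum_{f \in F^n} A'_i(f)^2 = 2^n \sum_{\mathrm{wt}(u) = i} \chi_u(C)^2 = 2^n M^2 B'_i. \tag{15}$$

Q.E.D.

Corollary 16. $B'_i = 0$ iff $A'_i(f) = 0$ for all $f \in F^n$.

Proof. Theorem 7 of Ch. 5, and Equation (15). Q.E.D.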


As in §3 of Ch. 5 let

$$Y_i = \sum_{\mathrm{wt}(v) = i} z^v.$$


Lemma 17. If $u \in F^n$ has weight $s$,

$$\chi_u(Y_i) = \sum_{r=0}^{i} (-1)^r \binom{s}{r} \binom{n - s}{i - r} = P_i(s),$$

where $P_i(x)$ is a Krawtchouk polynomial.


Proof.

$$\chi_u(Y_i) = \sum_{\mathrm{wt}(v) = i} (-1)^{u \cdot v}.$$

There are $\binom{s}{r} \binom{n-s}{i-r}$ vectors $v$ of weight $i$ which have $i - r$ 1's in the $n - s$
coordinates where $u$ is 0, and $r$ 1's in the $s$ coordinates where $u$ is 1. Each of
these vectors $v$ contributes $(-1)^r$ to the sum. Q.E.D.

The annihilator polynomial of $\mathcal{C}$ is defined to be

$$\alpha(x) = \frac{2^n}{M} \prod_{j=1}^{s'} \left( 1 - \frac{x}{\sigma_j} \right), \tag{16}$$

where $0, \sigma_1, \sigma_2, \ldots, \sigma_{s'}$ are the subscripts $i$ for which $B'_i \neq 0$. Note that for
$0 < i \le n$ either $\alpha(i) = 0$ or $B'_i = 0$. Hence by Corollary 16, for $i > 0$, $A'_i(f) \neq 0$ for some $f$
implies $\alpha(i) = 0$.
The expansion of $\alpha(x)$ in terms of Krawtchouk polynomials,

$$\alpha(x) = \sum_{i=0}^{s'} \alpha_i P_i(x),$$

is called the Krawtchouk expansion of $\alpha(x)$, and the $\alpha_i$ are called the
Krawtchouk coefficients. The expansion stops at $P_{s'}(x)$ since $\alpha(x)$ is of degree
$s'$. Also $\alpha_{s'} \neq 0$ and (from Theorem 20 of Ch. 5)

$$\alpha_i = 2^{-n} \sum_{k=0}^{n} \alpha(k) P_k(i).$$
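Both Lemma 17 and the formula for the Krawtchouk coefficients can be checked numerically. The sketch below is illustrative, with hypothetical function names. It assumes the standard binary Krawtchouk polynomial $P_k(x) = \sum_j (-1)^j \binom{x}{j} \binom{n-x}{k-j}$ and the inversion formula $\alpha_i = 2^{-n} \sum_k \alpha(k) P_k(i)$.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def krawtchouk(k, x, n):
    """Binary Krawtchouk polynomial P_k(x; n)."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))


def character_sum(u, i):
    """chi_u(Y_i): sum of (-1)^(u.v) over all vectors v of weight i."""
    n = len(u)
    return sum((-1) ** sum(u[j] for j in support)
               for support in combinations(range(n), i))


def krawtchouk_coefficients(alpha, n):
    """Coefficients alpha_i with alpha(x) = sum_i alpha_i P_i(x), for deg(alpha) <= n."""
    return [Fraction(sum(alpha(k) * krawtchouk(k, i, n) for k in range(n + 1)), 2 ** n)
            for i in range(n + 1)]


# Lemma 17 for a vector u of weight 3 and length 7.
u = (1, 1, 0, 1, 0, 0, 0)
print([character_sum(u, i) for i in range(8)])
print([krawtchouk(i, 3, 7) for i in range(8)])

# Annihilator polynomial 8 - 2x of the [7, 4, 3] Hamming code.
print(krawtchouk_coefficients(lambda x: 8 - 2 * x, 7))
```

For the Hamming code of length 7 the coefficients come out as $\alpha_0 = \alpha_1 = 1$ and zero otherwise, i.e. $8 - 2x = P_0(x) + P_1(x)$.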


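Lemma 18.

$$C \sum_{i=0}^{s'} \alpha_i Y_i = \sum_{f \in F^n} z^f.$$


Proof. To prove that these two elements of the group algebra are equal it is
enough (by Problem 16 of Ch. 5) to show that $\chi_v(\text{LHS}) = \chi_v(\text{RHS})$ for all
$v \in F^n$. Now

$$\chi_v(\text{RHS}) = \sum_{f \in F^n} (-1)^{v \cdot f} = 2^n \delta_{0,v},$$

$$\chi_v(\text{LHS}) = \chi_v(C)\, \chi_v \Big( \sum_{i=0}^{s'} \alpha_i Y_i \Big),$$

by Problem 12 of Ch. 5. First, if $v$ has weight $w > 0$, by Lemma 17

$$\chi_v \Big( \sum_{i=0}^{s'} \alpha_i Y_i \Big) = \sum_{i=0}^{s'} \alpha_i P_i(w) = \alpha(w).$$

If $w$ is one of the $\sigma_i$'s, $\alpha(w) = 0$ by definition, but if not, $\chi_v(C) = 0$ by
Theorem 7 of Ch. 5. Thus in either case

$$\chi_v(\text{LHS}) = \chi_v \Big( C \sum_{i=0}^{s'} \alpha_i Y_i \Big) = 0, \qquad v \neq 0. \tag{17}$$

Second, if $v = 0$, $\chi_0(\text{LHS}) = \chi_0(C)\, \alpha(0) = M \cdot 2^n/M = 2^n$. Q.E.D.


Property (17) is the reason $\alpha(x)$ is called the annihilator polynomial of $\mathcal{C}$.

Lemma 19. If

$$\sum_{i=0}^{n} \beta_i Y_i$$

has the property that

$$\chi_u \Big( C \sum_{i=0}^{n} \beta_i Y_i \Big) = 0 \quad \text{for all } u \in F^n,\ u \neq 0,$$

then the annihilator polynomial $\alpha(x)$ of $\mathcal{C}$ divides

$$\beta(x) = \sum_{i=0}^{n} \beta_i P_i(x).$$


Problem. (9) Prove Lemma 19.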


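Theorem 20. For each $f \in F^n$ the numbers $A_i(f)$ are uniquely determined by
$A_0(f), \ldots, A_{s'-1}(f)$.


Proof. (i) We first show that $A_{s'}(f)$ is given in terms of $A_0(f), \ldots, A_{s'-1}(f)$ by

$$\alpha_{s'} A_{s'}(f) = 1 - \sum_{i=0}^{s'-1} \alpha_i A_i(f). \tag{18}$$

(Recall $\alpha_{s'} \neq 0$.) To prove this we calculate

$$C Y_i = \sum_{v \in \mathcal{C}} z^v \sum_{\mathrm{wt}(w) = i} z^w = \sum_{v \in \mathcal{C}} \sum_{\mathrm{wt}(w) = i} z^{v + w}$$

$$= \sum_{v \in \mathcal{C}} \; \sum_{\substack{f \in F^n \text{ with} \\ \mathrm{wt}(f + v) = i}} z^f, \quad \text{where } f = v + w,$$

$$= \sum_{f \in F^n} A_i(f) z^f,$$

since $A_i(f)$ is the number of codewords $v$ at distance $i$ from $f$. Then

$$\sum_{f \in F^n} \sum_{i=0}^{s'} \alpha_i A_i(f) z^f = C \sum_{i=0}^{s'} \alpha_i Y_i = \sum_{f \in F^n} z^f \quad \text{(by Lemma 18)}$$

so that

$$\sum_{i=0}^{s'} \alpha_i A_i(f) = 1, \tag{19}$$

which proves (18).
(ii) If we expand

$$x \alpha(x) = \sum_{i=0}^{s'+1} \beta_i P_i(x),$$

then the proof of Lemma 18 shows

$$C \sum_{i=0}^{s'+1} \beta_i Y_i = 0,$$

hence

$$\sum_{i=0}^{s'+1} \beta_i A_i(f) = 0,$$

giving $A_{s'+1}(f)$. To obtain $A_{s'+2}(f)$ we expand $x^2 \alpha(x)$, and so on. Q.E.D.


Remark. The recurrence formula for Krawtchouk polynomials (Theorem 19
of Ch. 5) is helpful in obtaining the expansion of $x\alpha(x)$ from that of $\alpha(x)$.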


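Example (1) For the Hamming code $\mathcal{H}$ of length $n$, since the code is perfect
there are just two types of cosets, namely the code itself, and $n$ cosets with
coset leader of weight 1. If $f \in \mathcal{H}$, $A_i(f) = A_i$. If $f \notin \mathcal{H}$, $A_i + nA_i(f) = \binom{n}{i}$, so
$A_i(f) = (1/n)\big(\binom{n}{i} - A_i\big)$. For $n = 7$, see Fig. 6.1.


    Number of    Weight
    cosets       0  1  2  3  4  5  6  7
    ------------------------------------
        1        1  0  0  7  7  0  0  1
        7        0  1  3  4  4  3  1  0

Fig. 6.1. Weight distribution of cosets of Hamming code of length 7.


We illustrate Theorem 20 by verifying the second line of this figure.
For this code $s' = 1$, $\sigma_1 = 4$, so $\alpha(x) = 8(1 - \tfrac{1}{4}x) = 8 - 2x = P_0(x) + P_1(x)$.
Hence $\alpha_0 = \alpha_1 = 1$, and $A_1(f) = 1 - A_0(f)$. Since $f \notin \mathcal{H}$, $A_0(f) = 0$, $A_1(f) = 1$.
The recurrence (Theorem 19 of Ch. 5) gives

$$x P_k(x) = -\tfrac{1}{2}(k + 1) P_{k+1}(x) + \tfrac{7}{2} P_k(x) - (4 - \tfrac{1}{2}k) P_{k-1}(x),$$

$$x P_0(x) = -\tfrac{1}{2} P_1(x) + \tfrac{7}{2} P_0(x),$$
$$x P_1(x) = -P_2(x) + \tfrac{7}{2} P_1(x) - \tfrac{7}{2} P_0(x),$$
$$x P_2(x) = -\tfrac{3}{2} P_3(x) + \tfrac{7}{2} P_2(x) - 3 P_1(x).$$

Next, $x\alpha(x) = x P_1(x) + x P_0(x) = -P_2(x) + 3P_1(x)$. Therefore $3A_1(f) - A_2(f) = 0$,
so $A_2(f) = 3$. Again,

$$x^2 \alpha(x) = -x P_2(x) + 3x P_1(x) = \tfrac{3}{2} P_3(x) - \tfrac{13}{2} P_2(x) + \tfrac{27}{2} P_1(x) - \tfrac{21}{2} P_0(x),$$
$$\tfrac{3}{2} A_3(f) - \tfrac{13}{2} A_2(f) + \tfrac{27}{2} A_1(f) - \tfrac{21}{2} A_0(f) = 0,$$

which gives $A_3(f) = 4$, verifying the second line of Fig. 6.1.

Example (2) The Nordstrom-Robinson code $\mathcal{N}_{16}$ (§8 of Ch. 2) has distance
distribution $B_0 = B_{16} = 1$, $B_6 = B_{10} = 112$, $B_8 = 30$, and the transformed
distribution is the same, $B'_i = B_i$. Thus the annihilator polynomial is

$$\alpha(x) = 2^8 \left(1 - \frac{x}{6}\right) \left(1 - \frac{x}{8}\right) \left(1 - \frac{x}{10}\right) \left(1 - \frac{x}{16}\right)$$
$$= \tfrac{1}{20} P_4(x) + \tfrac{1}{5} P_3(x) + \tfrac{3}{10} P_2(x) + P_1(x) + P_0(x),$$

$$x\alpha(x) = -\tfrac{1}{8} P_5(x) + \tfrac{33}{40} P_3(x) + \tfrac{21}{4} P_1(x).$$

Continuing in this way we find the weight distributions of the translates of $\mathcal{N}_{16}$
shown in Fig. 6.2.


Fig. 6.2. Weight distribution of translates of $\mathcal{N}_{16}$.


    Number \ Weight    0    2    4    6    8   10
    ----------------------------------------------
         1             1    0    0  112   30  112
       120             0    1   14   63  100   63
         7             0    0   20   48  120   48

    Number \ Weight    1    3    5    7    9
    -----------------------------------------
        16             1    0   42   85   85
       112             0    5   33   90   90

The entries for the remaining weights follow from the symmetry $A_i(f) = A_{16-i}(f)$,
which holds because $\mathcal{N}_{16}$ contains the all-ones vector.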

Example (3) Cosets of double-error-correcting BCH codes. Let $\mathcal{C}$ be a
double-error-correcting BCH code with parameters $[2^m - 1, 2^m - 1 - 2m, 5]$ for
odd $m \ge 3$. It will be shown in §4 of Ch. 15 that $\mathcal{C}^\perp$ has just 3 nonzero
weights, namely $2^{m-1} \pm 2^{(m-1)/2}$ and $2^{m-1}$. Therefore the annihilator polynomial
(16) of $\mathcal{C}$ is

$$\alpha(x) = 2^{2m} \left( 1 - \frac{x}{2^{m-1} - 2^{(m-1)/2}} \right) \left( 1 - \frac{x}{2^{m-1}} \right) \left( 1 - \frac{x}{2^{m-1} + 2^{(m-1)/2}} \right)$$

$$= \sum_{i=0}^{3} \alpha_i P_i(x).$$

For the present purpose we need only to find $\alpha_3$. The coefficient of $x^3$ in $\alpha(x)$
is

$$-\frac{2^{2m}}{2^{m-1} (2^{2m-2} - 2^{m-1})} = -\frac{4}{2^{m-1} - 1},$$

and the coefficient of $x^3$ in

$$P_3(x) = \sum_{j=0}^{3} (-1)^j \binom{x}{j} \binom{n - x}{3 - j}$$

is $-\tfrac{4}{3}$. Therefore

$$\alpha_3 = \frac{3}{4} \cdot \frac{4}{2^{m-1} - 1} = \frac{3}{2^{m-1} - 1}.$$

Thus in a coset of $\mathcal{C}$ with minimum weight 3, from (18)

$$A_3(f) = \frac{1}{\alpha_3} = \frac{2^{m-1} - 1}{3}.$$

Therefore all cosets with minimum weight 3 have the same weight distribution.
Equation (18) also shows that all cosets must have minimum weight
0, 1, 2 or 3. Since the code is double-error-correcting, there is one coset of
weight 0, $n$ of weight 1, and $\binom{n}{2}$ of weight 2. The rest have weight 3. Thus $\mathcal{C}$ is
a quasi-perfect code (see §5 of Ch. 1).

For even values of $m$ the double-error-correcting BCH code of length
$2^m - 1$ is still quasi-perfect, as we shall see in §8 of Ch. 9, although the dual
code now contains more than 3 nonzero weights. The following theorem
explains why $s'$ is called the external distance of the code.


Theorem 21. For any vector $f \in F^n$ there is a codeword at distance $\le s'$ from
$f$.


Proof. From (18), not all of the numbers $A_i(f)$ for $i \le s'$ can be zero.
Q.E.D.


Example. For the Hamming code $s' = 1$, and indeed since the code is perfect
there is a codeword at distance $\le 1$ from every vector.


Remark. We may define the covering radius of $\mathcal{C}$ to be

$$t = \max_{f \in F^n} \min_{u \in \mathcal{C}} \operatorname{dist}(u, f).$$

Thus $t$ is the true external distance of the code, i.e., $t$ is the maximum of the
smallest weight in any translate of $\mathcal{C}$. Theorem 21 says that $t \le s'$. But $t$ may
be less than $s'$, as we see in Problem 11.
Two other metric properties of a code are its diameter,

$$\delta = \max_{u, v \in \mathcal{C}} \operatorname{dist}(u, v),$$

and radius,

$$\rho = \min_{u \in F^n} \max_{v \in \mathcal{C}} \operatorname{dist}(u, v).$$

$\rho$ can be obtained if we know the weight distributions of the translates of $\mathcal{C}$,
for $\rho$ is the minimum of the largest weight in any translate.
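For short codes the covering radius, diameter and radius can be found by exhaustive search over $F^n$. The sketch below is illustrative, not an efficient algorithm: it costs $2^n M$ distance evaluations, and the helper names and the generator matrix of the simplex code are assumptions made for the example.

```python
from itertools import product


def dist(u, v):
    """Hamming distance between two vectors of equal length."""
    return sum(a != b for a, b in zip(u, v))


def span(generators):
    """All GF(2) linear combinations of the generator rows."""
    n = len(generators[0])
    return [tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % 2
                  for i in range(n))
            for coeffs in product((0, 1), repeat=len(generators))]


def metric_parameters(code):
    """Return (covering radius t, diameter delta, radius rho) by brute force."""
    n = len(code[0])
    space = list(product((0, 1), repeat=n))
    t = max(min(dist(u, f) for u in code) for f in space)
    delta = max(dist(u, v) for u in code for v in code)
    rho = min(max(dist(u, v) for v in code) for u in space)
    return t, delta, rho


# The [7, 3, 4] simplex code.
simplex = span([(0, 0, 0, 1, 1, 1, 1), (0, 1, 1, 0, 0, 1, 1), (1, 0, 1, 0, 1, 0, 1)])
print(metric_parameters(simplex))
```

For the simplex code of length 7 this gives $t = 3$ and $\rho = \delta = 4$; here $t$ happens to equal $s' = n - 4 = 3$, and the strict inequality $t < s'$ appears for longer simplex codes.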


Problems. (10) Show that

$$\tfrac{1}{2} \delta \le \rho \le \delta. \tag{20}$$

(11) (i) Show that $t = 2^{m-1} - 1$ for the $[n = 2^m - 1, k = m, d = 2^{m-1}]$ simplex
code. [Hint: use the Plotkin bound.] (ii) Show $s' = n - 4$, and thus $t < s'$. (iii)
Show $\rho = \delta = 2^{m-1}$. (iv) Hence show there are infinitely many codes for which
equality holds on the RHS of (20), and similarly for the LHS.


Research Problem (6.2). Find bounds on $t$ for any code, and a good method for
calculating it.


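Theorem 22. Let $\mathcal{C}$ be an $[n, k, d]$ binary linear code, and $C = \sum_{v \in \mathcal{C}} z^v$. For
$t \le [\tfrac{1}{2}(d - 1)]$ let $D_t = C Y_t$, representing the vectors which are at distance $t$ from
some codeword of $\mathcal{C}$, i.e., the union of the cosets with coset leaders of weight $t$.
Let $f_s$ be the number of vectors of weight $s$ in $D_t$. Then

$$\sum_{s=0}^{n} f_s x^{n-s} y^s = \frac{1}{2^{n-k}} \sum_{i=0}^{n} A'_i P_t(i) (x + y)^{n-i} (x - y)^i. \tag{21}$$


Proof. Write $D_t = \sum_{v \in F^n} d_v z^v$. Then by Problems 16 and 14 of Ch. 5,

$$d_v = \frac{1}{2^n} \sum_{u \in F^n} (-1)^{u \cdot v} \chi_u(C)\, P_t(\mathrm{wt}(u)) = \frac{1}{2^{n-k}} \sum_{u \in \mathcal{C}^\perp} (-1)^{u \cdot v} P_t(\mathrm{wt}(u)),$$

by Problem 13 of Ch. 5. Therefore the LHS of (21) is

$$\sum_{v \in F^n} d_v x^{n - \mathrm{wt}(v)} y^{\mathrm{wt}(v)} = \frac{1}{2^{n-k}} \sum_{v \in F^n} \sum_{u \in \mathcal{C}^\perp} (-1)^{u \cdot v} P_t(\mathrm{wt}(u))\, x^{n - \mathrm{wt}(v)} y^{\mathrm{wt}(v)}. \tag{22}$$

On the other hand the RHS is

$$\frac{1}{2^{n-k}} \sum_{u \in \mathcal{C}^\perp} P_t(\mathrm{wt}(u)) (x + y)^{n - \mathrm{wt}(u)} (x - y)^{\mathrm{wt}(u)},$$

which is equal to (22) by Equations (9) and (11) of Ch. 5. Q.E.D.

Problem. (12) Suppose a code has minimum distance $d = 2e + 1$, and consider
the incomplete decoding algorithm (§5 of Ch. 1) which corrects all error
patterns of weight $\le t$ and no more, for some fixed $t \le e$. Show that the
probability of correct decoding is

$$\sum_{i=0}^{t} \binom{n}{i} p^i (1 - p)^{n-i},$$

and that the probability of the decoder making an incorrect decision is

$$\sum_{i=0}^{t} \sum_{s=t+1}^{n} f_s\, p^s (1 - p)^{n-s},$$

where in the inner sum $f_s$ is the number of vectors of weight $s$ in $D_i$.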
§7. Designs from nonlinear codes when $s' < d$


If the code is nonlinear, the case $s' < d$ is not covered by Corollary 14. The
following weaker result applies to this case.


Theorem 23. Let $\mathcal{C}$ be an $(n, M, d)$ code. If $d - s' \le s' < d$, the codewords of
weight $d$ in $\mathcal{C}$ form a

$$(d - s')\text{-}\left( n, d, \frac{1 - \alpha_{d-s'}}{\alpha_{s'}} \right) \tag{23}$$

design.


Proof. Let $f$ be a vector of weight $d - s'$. If $d - s' < s'$, in the translate $\mathcal{C} + f$
we have

$$A_0(f) = \cdots = A_{d-s'-1}(f) = A_{d-s'+1}(f) = \cdots = A_{s'-1}(f) = 0,$$
$$A_{d-s'}(f) = 1,$$

and $A_{s'}(f)$ is the number of codewords of weight $d$ which cover $f$. By (18),

$$A_{s'}(f) = \frac{1 - \alpha_{d-s'}}{\alpha_{s'}},$$

which is independent of the choice of $f$. If $s' = d - s'$ (i.e., $s' = \tfrac{1}{2}d$), then

$$A_0(f) = \cdots = A_{s'-1}(f) = 0, \qquad A_{s'}(f) \ge 1.$$

In this case $A_{s'}(f) - 1$ is the number of codewords of weight $d$ which cover $f$,
and

$$A_{s'}(f) - 1 = \frac{1 - \alpha_{s'}}{\alpha_{s'}},$$

which is also independent of $f$. In either case then, the number of codewords
of weight $d$ which cover $f$ is the same for all $f$ of weight $d - s'$, so these
codewords form a

$$(d - s')\text{-}(n, d, (1 - \alpha_{d-s'})/\alpha_{s'}) \text{ design. Q.E.D.}$$

Theorem 24. If $d - s' \le s' < d$, the codewords of any fixed weight $w$ in $\mathcal{C}$ form a
$(d - s')$-design.


Proof. Let $f$ have weight $d - s'$. The theorem is true for codewords of weight
$d$ by Theorem 23. The number of codewords of weight $d + 1$ which cover $f$ is
$A_{s'+1}(f)$. Since $A_{s'+1}(f)$ is independent of the choice of $f$ by Theorem 20, the
theorem is true for $w = d + 1$.

We may write $A_{s'+2}(f) = T_1 + T_2$, where $T_1$ is the number of codewords of
weight $d + 2$ which cover $f$, and $T_2$ is the number of codewords of weight $d$
which cover exactly $d - s' - 1$ ones of $f$. $T_2$ is determined by Theorem 23, and
is independent of the choice of $f$. Therefore $T_1$ is also independent of the
choice of $f$, and so the theorem is true for $w = d + 2$. Clearly we may continue
in this way. Q.E.D.


Theorem 24 is weaker than Corollary 14. For the codewords of weight 8 in
the extended Golay code form a 5-design (by Corollary 14 or even by
Theorem 9), but Theorem 24 only says they form a 4-design.


Research Problem (6.3). Strengthen Theorem 24 for nonlinear codes.


§8. Perfect codes


In this section we shall prove that an $(n, M, d = 2e + 1)$ code is perfect if
and only if $s' = e$. We conclude the section with an important necessary
condition (Lloyd's theorem) for a code to be perfect.


Theorem 25. For any code, $s' \ge \tfrac{1}{2}(d - 1)$.


Proof. Assume that $\mathcal{C}$ contains the codeword 0. First suppose $d$ is odd and let $v$
be a vector of weight $\tfrac{1}{2}(d - 1)$. If $s' < \tfrac{1}{2}(d - 1)$ then by Theorem 21 there is a
codeword $c$ at distance $< \tfrac{1}{2}(d - 1)$ from $v$. But then $0 < \mathrm{wt}(c) < d$, a contradiction.
Similarly if $d$ is even. Q.E.D.


Lemma 26. If $s' = \tfrac{1}{2}(d - 1)$ ($d$ is odd) then $\alpha(x) = P_0(x) + P_1(x) + \cdots + P_{s'}(x)$.


Proof. Let $f$ be a vector of weight $i < \tfrac{1}{2}(d - 1)$. In $\mathcal{C} + f$ we have

$$A_0(f) = \cdots = A_{i-1}(f) = 0, \qquad A_i(f) = 1,$$
$$A_{i+1}(f) = \cdots = A_{s'}(f) = 0.$$

Then by (18), $\alpha_i = 1$ for $i = 0, 1, \ldots, s' - 1$. If $f$ has weight $s' = \tfrac{1}{2}(d - 1)$, then

$$A_0(f) = \cdots = A_{s'-1}(f) = 0, \qquad A_{s'}(f) = 1,$$

so again $\alpha_{s'} = 1$ from (18). Q.E.D.


Theorem 27. An $(n, M, 2e + 1)$ code $\mathcal{C}$ is perfect iff $s' = e$.


Proof. Suppose $s' = e$. Then by Theorem 21 every vector is at distance $\le e$
from some codeword, which says that the spheres of radius $e$ about the
codewords fill $F^n$, and $\mathcal{C}$ is perfect. Conversely, suppose $\mathcal{C}$ is perfect, so that
(26) of Ch. 5 holds.

We apply Lemma 19 with $\beta(x) = P_0(x) + \cdots + P_e(x)$ and deduce that the
annihilator polynomial $\alpha(x)$ of $\mathcal{C}$ must divide $P_0(x) + \cdots + P_e(x)$. Hence

$$e = \deg \{ P_0(x) + \cdots + P_e(x) \} \ge \deg \alpha(x) = s'.$$

But $s' \ge e$ by Theorem 25. Q.E.D.


Thus the annihilator polynomial of a perfect code is

$$\alpha(x) = P_0(x) + \cdots + P_e(x).$$

This is called Lloyd's polynomial and is denoted by $L_e(x)$. By Problem 42 of
Ch. 5, $L_e(x) = P_e(x - 1; n - 1)$. It follows that:


Theorem 28. (Lloyd.) If there exists a binary $(n, M, 2e + 1)$ perfect code then
$L_e(x)$ has $e$ integer zeros $\sigma_1, \ldots, \sigma_e$ satisfying

$$0 < \sigma_1 < \cdots < \sigma_e < n.$$


Example. For the [7, 4, 3] Hamming code, $L_1(x) = 8 - 2x$.
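Lloyd's condition is easy to test by machine: evaluate $L_e(x) = P_0(x) + \cdots + P_e(x)$ at the integers $1, \ldots, n-1$ and count the zeros. The sketch below is illustrative. It assumes the standard Krawtchouk polynomial over an alphabet of size $q$, $P_k(x; n) = \sum_j (-1)^j (q-1)^{k-j} \binom{x}{j} \binom{n-x}{k-j}$ (binary when $q = 2$), and the function names are hypothetical.

```python
from math import comb


def krawtchouk_q(k, x, n, q=2):
    """Krawtchouk polynomial P_k(x; n) for alphabet size q, at an integer x."""
    return sum((-1) ** j * (q - 1) ** (k - j) * comb(x, j) * comb(n - x, k - j)
               for j in range(k + 1))


def lloyd_polynomial(x, n, e, q=2):
    """L_e(x) = P_0(x; n) + ... + P_e(x; n)."""
    return sum(krawtchouk_q(k, x, n, q) for k in range(e + 1))


def lloyd_integer_zeros(n, e, q=2):
    """Integer zeros of L_e(x) strictly between 0 and n."""
    return [x for x in range(1, n) if lloyd_polynomial(x, n, e, q) == 0]


print(lloyd_integer_zeros(7, 1))      # Hamming code of length 7
print(lloyd_integer_zeros(23, 3))     # binary Golay parameters
print(lloyd_integer_zeros(90, 2))     # n = 90, e = 2
```

A perfect $e$-error-correcting code of length $n$ can exist only if the returned list has $e$ entries. The parameters $n = 90$, $e = 2$ give no integer zeros, so no such binary perfect code exists even though $1 + 90 + \binom{90}{2} = 2^{12}$.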


§9. Codes over GF(q)


Most of this chapter can be generalized to codes over GF(q). To begin
with, the parameters $d$, $s$, $d'$, $s'$ are defined as in §2 with weight and distance
replaced respectively by Hamming weight and Hamming distance.

Equation (1) now reads

$$1 + \sum_{i=1}^{s} B_{\tau_i} y^{\tau_i} = M q^{-n} \sum_{i=0}^{n} B'_i (1 + \gamma y)^{n-i} (1 - y)^i,$$

where $\gamma = q - 1$, and Theorem 2 becomes

Theorem 2*. If $s \le d'$, then

$$B_{\tau_i} = -\prod_{\substack{j=1 \\ j \neq i}}^{s} \frac{\tau_j}{\tau_j - \tau_i} + \frac{1}{N} \sum_{t=0}^{n} \gamma^t \binom{n}{t} \prod_{\substack{j=1 \\ j \neq i}}^{s} \frac{\tau_j - t}{\tau_j - \tau_i}$$

for $1 \le i \le s$, where $N = q^n/M$.


Theorem 4 is changed similarly, while Lemma 1 and Theorems 3, 5, 6 are
unchanged.

Since there is no longer a unique vector of weight $n$, we cannot introduce $\bar{s}$,
and there is no analog of Theorem 7.


t-Designs from nonbinary codes. If $\mathcal{C}$ is a linear code of length $n$ over GF(q),
we shall try to get a $t$-design from the codewords of weight $w$, $w > 0$, in the
following way. The set of $w$ coordinates where a codeword is nonzero is
called the support of the codeword. Of course (since the code is linear) the
$q - 1$ scalar multiples of this codeword have the same support. Then this
support (counted once, not $q - 1$ times) will form a block of the $t$-design.

The $t$-design may still contain repeated blocks. However this will not be so
for binary codes, nor for codes over GF(q) if $w$ is equal to the minimum
weight $d$ of the code. For suppose $c$ and $c'$ are codewords of weight $d$ with
the same support which are not scalar multiples of each other. Then there is a
scalar multiple of $c$, $tc$ say, such that $0 < \operatorname{dist}(tc, c') < d$, a contradiction.

For example, consider code #6 of Ch. 1. The codewords 0121 and 0212
both have support 0111. Thus the 8 codewords of weight 3 give the four
blocks

    0111, 1011, 1101, 1110,

forming a trivial 3-(4, 3, 1) design.
A sufficient condition for this construction to produce a $t$-design is given
by:


Theorem 29. (Assmus and Mattson.) Let $\mathcal{C}$ be an $[n, k, d]$ linear code over
GF(q). Suppose we can find an integer $t$, with $0 < t < d$, such that there are at
most $d - t$ $\sigma_i$'s in the range $1 \le \sigma_i \le n - t$. (Here the $\sigma_i$'s are the weights
occurring in $\mathcal{C}^\perp$, as in §2.) Then the supports of the codewords of weight $d$ in
$\mathcal{C}$ form a $t$-design.


Proof. Let $T$ be any set of $t$ coordinate places, and (as in §5) let $\mathcal{C}^T$ be formed
by deleting those coordinates from $\mathcal{C}$. Thus $\mathcal{C}^T$ is an $[n - t, k, d - t]$ code. The
dual code is $(\mathcal{C}^T)^\perp = (\mathcal{C}^\perp)^{\circ T}$. By hypothesis $(\mathcal{C}^T)^\perp$ has $\le d - t$ nonzero weights
and its dual has minimum distance $d - t$. So we may apply Theorem 2* to
obtain the weight distribution of $(\mathcal{C}^T)^\perp$, and hence by Theorem 13 of Ch. 5,
the weight distribution of $\mathcal{C}^T$. In particular the number of codewords of
weight $d - t$ in $\mathcal{C}^T$, $a$ say, is independent of the choice of $T$.

Each of these $a$ codewords comes from a codeword in $\mathcal{C}$ with weight $d$,
whose support contains $T$. Now (as we showed above) all codewords with
weight $d$ with the same support are scalar multiples of each other. Therefore
the number of blocks which contain $T$ is $a/(q - 1)$, and is independent of the
choice of $T$. Q.E.D.


Example. The self-dual code #6 of Ch. 1 contains codewords of weight 0 and
3 only. Thus $n = 4$, $d = \sigma_1 = 3$. Therefore we may take $t = 2$, and deduce from
Theorem 29 that the supports of the codewords of weight 3 form a 2-design.
In fact, as we saw earlier, they form a 3-design, so the conclusion of the
theorem is not always the strongest possible result.


Corollary 30. Assume the same hypotheses as Theorem 29, and $q > 2$. Then
for each weight $\tau_i$ in the range $d \le \tau_i \le v_0$, the supports of the codewords of
weight $\tau_i$ in $\mathcal{C}$ form a $t$-design, where $v_0$ is the largest integer satisfying

$$v_0 - \left[ \frac{v_0 + q - 2}{q - 1} \right] < d.$$


Proof. The definition of $v_0$ implies that if two codewords of weight $\tau_i \le v_0$ have
the same support, then they must be scalar multiples of each other. The proof
then proceeds as in Theorem 29. Q.E.D.


For example, the [12, 6, 6] ternary Golay code $\mathcal{G}_{12}$ (see Ch. 20) is a self-dual
code with weights 0, 6, 9, 12. Then $v_0 = 11$, so the supports of the codewords
of weights 6 and 9 form 5-designs.

In the binary case we can never have repeated blocks, therefore:


Corollary 31. Let $\mathcal{C}$ be an $[n, k, d]$ binary linear code. If the hypothesis of
Theorem 29 is satisfied then the codewords of any weight $\tau_i$ form a $t$-design.


Example. For the [24, 12, 8] extended Golay code we may take $t = 5$, for there
are $d - t = 3$ $\sigma_i$'s in the range [1, 19], namely 8, 12, 16. Thus the codewords of
each weight form a 5-design, as we have seen before.
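The hypothesis of Theorem 29 is a purely arithmetic condition on $n$, $d$ and the weights of the dual code, so the largest admissible $t$ can be found by a short search. The sketch below is illustrative, with a hypothetical function name. It takes the dual weights as input and does not verify that a code with those weights exists.

```python
def assmus_mattson_t(n, d, dual_weights):
    """Largest t with 0 < t < d such that at most d - t of the nonzero dual
    weights lie in the range [1, n - t]; returns 0 if no such t exists."""
    best = 0
    for t in range(1, d):
        in_range = [w for w in dual_weights if 1 <= w <= n - t]
        if len(in_range) <= d - t:
            best = t
    return best


# [24, 12, 8] extended Golay code (self-dual, weights 8, 12, 16, 24).
print(assmus_mattson_t(24, 8, [8, 12, 16, 24]))
# [12, 6, 6] ternary Golay code (self-dual, weights 6, 9, 12).
print(assmus_mattson_t(12, 6, [6, 9, 12]))
# Code #6 of Ch. 1: n = 4, d = 3, dual weights {3}.
print(assmus_mattson_t(4, 3, [3]))
```

The three calls return 5, 5 and 2 respectively, in agreement with the examples in the text.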


We have not been able to generalize Theorem 13, and state this as: 


Research Problem (6.4). Generalize Theorem 13 to GF(q). 







Problem. (13) Show that Corollary 31 implies Corollary 14. 


Research Problem (6.5). Is Corollary 31 truly stronger than Corollary 14? 


Weight distribution of translates, and perfect codes. All the results of §§6, 8
hold for codes over GF(q). The proofs for the binary case used the group
algebra introduced in Ch. 5. The proofs for codes over GF(q) are the same,
but require a more elaborate notation for the group algebra, which we do not
propose to give.

For example, Lloyd’s theorem becomes: 


Theorem 32. (Lloyd.) If there exists an $(n, M, 2e + 1)$ perfect code over GF(q),
then the Lloyd polynomial

$$L_e(x) = P_0(x; n) + \cdots + P_e(x; n)$$
$$= P_e(x - 1; n - 1)$$
$$= \sum_{j=0}^{e} (-1)^j (q - 1)^{e-j} \binom{x - 1}{j} \binom{n - x}{e - j} \tag{24}$$

has $e$ integer zeros $\sigma_1, \ldots, \sigma_e$ satisfying

$$0 < \sigma_1 < \cdots < \sigma_e < n.$$

Now $P_k(x; n)$ is given by Equation (53) of Ch. 5.


Problem. (14) Let $\mathcal{C}$ be an $[n, k, 2e + 1]$ linear code over GF(q). Show that $\mathcal{C}$ is
perfect iff the codewords of weight $2e + 1$ form an $(e + 1)$-$(n, 2e + 1, (q - 1)^e)$
design. If $\mathcal{C}$ is perfect, show that the number of codewords of weight $2e + 1$ in
$\mathcal{C}$ is

$$A_{2e+1} = \frac{(q - 1)^{e+1} \binom{n}{e+1}}{\binom{2e+1}{e+1}}.$$


*§10. There are no more perfect codes


Three types of perfect codes were discovered in the late 1940's:

(i) The linear single-error-correcting Hamming codes. We have seen the
binary version in §7 of Ch. 1. In Ch. 7 we shall construct the Hamming codes
over GF(q). These have the parameters

$$\left[ n = \frac{q^m - 1}{q - 1},\; k = n - m,\; d = 3 \right].$$

(ii) The binary [23, 12, 7] Golay code $\mathcal{G}_{23}$ (§6 of Ch. 2, and Ch. 20).

(iii) The ternary [11, 6, 5] Golay code $\mathcal{G}_{11}$ (Ch. 20).

As we shall see in Ch. 20, the parameters of the two Golay codes determine
them uniquely: any code with these parameters must be equivalent to one of
the Golay codes. But for single-error-correcting perfect codes the situation is
different. In 1962 Vasilev constructed a family of nonlinear binary
single-error-correcting codes with the same parameters as Hamming codes (see §9 of
Ch. 2). Later Schönheim and Lindström gave nonlinear codes with the same
parameters as Hamming codes over GF(q) for all $q$. This question is still not
completely settled:


Research Problem (6.6). Find all perfect nonlinear single-error-correcting
codes over GF(q).

Finally there are the trivial perfect codes: a code containing just one 
codeword, or the whole space, or a binary repetition code of odd length. 
Subsequently many people attempted to discover other perfect codes, or 
when this failed, to prove no others existed. Van Lint made considerable 
progress on this problem, which was finally Finn-ished by Tietäväinen in 1973. 
The final result is: 


Theorem 33. (Tietäväinen and Van Lint.) A nontrivial perfect code over any 
field GF(q) must have the same parameters n, M and d as one of the 
Hamming or Golay codes. 


We break this up into five parts, Theorems 37-41. The essence of the proof 
is to show that the only cases in which Lloyd's polynomial has distinct integer 
zeros in the range [1, n — 1] are those given above. 

Throughout §10 let 𝒞 be an (n, M, 2e + 1) perfect code over GF(q), where
q = p^r, p a prime, and let σ_1 < σ_2 < ... < σ_e be the integer zeros of L_e(x)
(Theorem 32).

First some lemmas. 


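Lemma 34. (The sphere packing condition.) The number of codewords M is a
power of q, and

\[ \sum_{i=0}^{e} \binom{n}{i} (q - 1)^i = q^l \tag{25} \]

for some integer l.

Proof. Since the code is perfect (cf. Theorem 6 of Ch. 1),

\[ M \sum_{i=0}^{e} \binom{n}{i} (q - 1)^i = q^n = p^{rn}. \]

Therefore both factors are powers of p, say M = p^{rn-j} and

\[ \sum_{i=0}^{e} \binom{n}{i} (q - 1)^i = p^j. \]

The sum is congruent to 1 modulo q - 1, since every term with i ≥ 1 is divisible
by q - 1. Thus q - 1 = p^r - 1 divides p^j - 1. By Lemma 9 of Ch. 4, r divides j,
so the sum is a power of q and hence M is a power of q. Q.E.D.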


Lemma 35. L_e(0) = q^l; and if 𝒞 is nontrivial, L_e(1) and L_e(2) are nonzero.


Proof. From (24), using \binom{-1}{j} = (-1)^j (see Problem 18 of Ch. 1),

\[ L_e(0) = \sum_{j=0}^{e} (q - 1)^{e-j} \binom{n}{e - j} = q^l \quad \text{(from Lemma 34)}. \]

Also from (24),

\[ L_e(1) = (q - 1)^e \binom{n - 1}{e}, \]

which is nonzero since e ≤ n - 1; and

\[ L_e(2) = \frac{(q - 1)^{e-1}}{e} \binom{n - 2}{e - 1} \{ (q - 1)(n - e - 1) - e \}, \tag{26} \]

which is zero only if q = 1 + e/(n - e - 1). But q ≥ 2, so this implies n ≤ 2e + 1,
and 𝒞 is trivial. Q.E.D.
Lemma 36.

\[ \sigma_1 + \cdots + \sigma_e = \frac{e(n - e)(q - 1)}{q} + \frac{e(e + 1)}{2}, \]

\[ \sigma_1 \sigma_2 \cdots \sigma_e = e! \, q^{l-e}. \]


Proof. From Theorem 15 of Ch. 5, L_e(x) is a polynomial in x of degree e. The
coefficient of x^e is (-q)^e/e!, the constant term is L_e(0) = q^l (Lemma 35),
and the coefficient of x^{e-1} is

\[ -\frac{(-q)^e}{e!} \left\{ \frac{e(e + 1)}{2} + \frac{q - 1}{q} \, e(n - e) \right\}. \]

Q.E.D.

Theorem 37. A perfect single-error-correcting code over GF(q) has the same
parameters n, M and d as the Hamming code.


Proof. The sphere packing condition (Lemma 34) is

\[ 1 + (q - 1)n = q^l, \]

and Lloyd's theorem (Theorem 32) says that

\[ L_1(x) = P_0(x) + P_1(x) = 1 + \{ (q - 1)n - qx \} \]

has an integer root x = σ_1. Thus

\[ 1 + (q - 1)n - q\sigma_1 = 0. \]

Hence σ_1 = q^{l-1} and n = (q^l - 1)/(q - 1). The size of this code is q^{n-l} from the
sphere packing bound. Q.E.D.


Furthermore this perfect code has the same weight and distance distribution
as the Hamming code. This follows from the GF(q) version of
Theorem 4, since d = 3 and s' = e = 1.


Theorem 38. There is no nontrivial binary perfect double-error-correcting 
code. 


Proof. The sphere packing condition says

\[ 1 + n + \binom{n}{2} = 2^l \]

or

\[ (2n + 1)^2 = 2^{l+3} - 7. \tag{27} \]

Lloyd's polynomial is 2L_2(x) = y^2 - 2(n + 1)y + 2^{l+1}, where y = 2x. Therefore by
Lemma 35 this polynomial must have distinct roots y_1 = 2σ_1 and y_2 = 2σ_2
which are even integers greater than 4. Since y_1 y_2 = 2^{l+1}, y_1 = 2^a and y_2 = 2^b for
3 ≤ a < b. Then y_1 + y_2 = 2^a + 2^b = 2(n + 1), so (27) becomes

\[ (2^a + 2^b - 1)^2 = 2^{l+3} - 7 = 2^{a+b+2} - 7. \]

Reducing both sides modulo 16 gives 1 ≡ -7, a contradiction. Q.E.D.
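
As a worked check of this argument (our own illustration, not part of the original proof), take n = 90. These parameters satisfy the sphere packing condition, yet the quadratic 2L_2(x) in y = 2x has no integer roots, in line with the theorem:

```latex
\[
1 + 90 + \binom{90}{2} = 1 + 90 + 4005 = 4096 = 2^{12},
\qquad (2 \cdot 90 + 1)^2 = 181^2 = 32761 = 2^{15} - 7 ,
\]
\[
2L_2(x) = y^2 - 2 \cdot 91\, y + 2^{13} = y^2 - 182y + 8192 ,
\qquad 182^2 - 4 \cdot 8192 = 33124 - 32768 = 356 .
\]
% 18^2 = 324 < 356 < 361 = 19^2, so the discriminant is not a perfect square
% and the roots y_1, y_2 are irrational.
```

Since 356 lies strictly between 18^2 and 19^2, the roots are not integers, so no perfect code with n = 90, e = 2 over GF(2) can exist.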


Theorem 39. The only possible parameters for a nontrivial perfect
double-error-correcting code over a field GF(q), q = p^r, are the parameters of the
ternary Golay code.

Proof. The sphere packing condition is now

\[ 1 + (q - 1)n + (q - 1)^2 \binom{n}{2} = q^l, \tag{28} \]

which implies

\[ 2(q - 1)n = q - 3 + \sqrt{q^2 - 6q + 1 + 8q^l}. \]

From Lemma 36,

\[ \sigma_1 \sigma_2 = 2q^{l-2}, \qquad \sigma_1 + \sigma_2 = \frac{2(n - 2)(q - 1)}{q} + 3. \]

Eliminating n gives

\[ q(\sigma_1 + \sigma_2) = 1 + \sqrt{q^2 - 6q + 1 + 8q^l}. \tag{29} \]

Since σ_1, σ_2 > 2 by Lemma 35, we have σ_1 = p^λ, σ_2 = 2p^μ for λ, μ ≥ 1 and
λ + μ = r(l - 2). Substituting in (29), squaring and dividing through by q we
obtain

\[ -2(p^\lambda + 2p^\mu) + q(p^\lambda + 2p^\mu)^2 = q - 6 + 8q^{l-1}. \tag{30} \]

All the terms except -6 contain a factor of p. Therefore p is 2 or 3. If p is 2,
then q - 6 must be divisible by 4 and so q = 2, which is impossible by
Theorem 38.

Now suppose p = 3. If q = 3, (30) becomes

\[ -2(3^\lambda + 2 \cdot 3^\mu) + 3(3^\lambda + 2 \cdot 3^\mu)^2 = -3 + 8 \cdot 3^{l-1}. \]

If l ≤ 3 this has no solutions. For l ≥ 4 after dividing by 3 we have

\[ 3^{2\lambda} + 4 \cdot 3^{\lambda+\mu} + 4 \cdot 3^{2\mu} - 2 \cdot 3^{\lambda-1} - 4 \cdot 3^{\mu-1} + 1 = 8 \cdot 3^{l-2}. \]

Therefore μ = 1, and so λ = 2. This implies l = 5 and (from (28)) n = 11, which
are the parameters of the ternary Golay code.
Finally suppose q = 3^r with r > 1. Equation (30) modulo 9 is

\[ 7(3^\lambda + 2 \cdot 3^\mu) \equiv 3 - q^{l-1} \pmod 9, \]

which shows l > 1, so

\[ 7(3^\lambda + 2 \cdot 3^\mu) \equiv 3 \pmod 9, \]

i.e., (multiplying by 4),

\[ 3^\lambda + 2 \cdot 3^\mu \equiv 3 \pmod 9. \]

Therefore λ = 1 and μ > 1. λ + μ = r(l - 2) implies 3^{μ+1} = q^{l-2}. Then (30)
becomes

\[ -2(3 + 2 \cdot 3^\mu) + 3^r (3 + 2 \cdot 3^\mu)^2 = 3^r - 6 + 8 \cdot 3^{r(l-1)}, \]

i.e.,

\[ -4 \cdot 3^\mu + 8 \cdot 3^r + 4 \cdot 3^{2\mu+r} = 4 \cdot 3^{\mu+r+1}. \]

But now the RHS is divisible by a higher power of 3 than the LHS.
Q.E.D.


Remark. A computer search has shown that the only solutions to Equation 
(25) in the range n <1000, e <1000, q <100 are the trivial solutions, the 
parameters of the Hamming and Golay codes, and one more solution: 


n = 90, k = 78, e=2 when q=2. 


But Theorem 38 shows there is no code with the latter parameters. 
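
The kind of search described in this Remark can be imitated on a small scale. The sketch below is our own illustration, not Van Lint's program: the function names are hypothetical, q is assumed prime, and the range is cut to n ≤ 100 so that it runs instantly. It lists the pairs (n, e) with e ≥ 2 and n > 2e + 1 for which the sphere size is a power of q.

```python
from math import comb

def sphere_size(n, e, q):
    """Number of vectors within Hamming distance e of a fixed vector in GF(q)^n."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(e + 1))

def is_power_of(q, value):
    """True if value = q^l for some integer l >= 0."""
    while value % q == 0:
        value //= q
    return value == 1

def sphere_packing_solutions(q, n_max, e_min=2):
    """Pairs (n, e), e >= e_min, 2e + 1 < n <= n_max, satisfying the sphere packing condition."""
    solutions = []
    for n in range(1, n_max + 1):
        for e in range(e_min, (n - 1) // 2 + 1):
            if n > 2 * e + 1 and is_power_of(q, sphere_size(n, e, q)):
                solutions.append((n, e))
    return solutions

print(sphere_packing_solutions(2, 100))  # binary case
print(sphere_packing_solutions(3, 50))   # ternary case
```

In the binary list the pair (90, 2) appears alongside the Golay parameters (23, 3); the sphere packing condition alone cannot exclude it, which is why Theorem 38 is needed.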


Research Problem (6.7). How close can one get: is there a [90, 77, 5] binary 
code? 


Theorem 40. There is no nontrivial perfect e-error-correcting code over GF(q),
q = p^r, if q > 2 and e > 2.


Proof. (i) We first show 2σ_1 ≤ σ_e. If s is an integer, s = p^α t, where (p, t) = 1, we
define a_p(s) = t. Clearly a_p(s_1 s_2) = a_p(s_1) a_p(s_2), and a_p(s) ≤ s. From Lemma 36,
σ_1 σ_2 ··· σ_e = e! q^{l-e}, so

\[ a_p(\sigma_1) a_p(\sigma_2) \cdots a_p(\sigma_e) = a_p(e!) \le e!. \]

So either two of the a_p(σ_i)'s are equal, say a_p(σ_i) = a_p(σ_j), or else the numbers
a_p(σ_1), ..., a_p(σ_e) are equal to 1, ..., e in some order and p > e ≥ 3. In the
first case σ_i = p^α t, σ_j = p^β t, and if i < j, 2σ_i ≤ σ_j. In the second case for some i
and j, a_p(σ_i) = 1, a_p(σ_j) = 2, σ_i = p^α, σ_j = 2p^β, and (interchanging i and j if
necessary) 2σ_i ≤ σ_j. Therefore 2σ_1 ≤ 2σ_i ≤ σ_j ≤ σ_e.

(ii) Next we show

\[ \sigma_1 \sigma_e \le \frac{8}{9} \left( \frac{\sigma_1 + \sigma_e}{2} \right)^2. \tag{31} \]

Writing x = σ_e/σ_1, this becomes

\[ x \le \tfrac{2}{9} (1 + x)^2 \quad \text{for } x \ge 2, \]

which is immediate.


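(iii)

\[
\sigma_1 \sigma_2 \cdots \sigma_e
  = \frac{(-1)^e L_e(0)}{\text{coeff. of } x^e \text{ in } L_e(x)}
  = q^{-e} e! \sum_{j=0}^{e} (q - 1)^{e-j} \binom{n}{e - j} \tag{32}
\]

\[ > q^{-e} (q - 1)^e n(n - 1) \cdots (n - e + 1), \]

taking only the term j = 0 in the sum,

\[ \ge q^{-e} (q - 1)^e n^e \left( 1 - \frac{e(e - 1)}{2n} \right). \tag{33} \]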


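(iv) We shall now use a weighted version of the arithmetic-mean
geometric-mean inequality, which states that for real numbers x_i ≥ 0, and weights a_i > 0
satisfying

\[ \sum_{i=1}^{N} a_i = 1, \]

we have

\[ \prod_{i=1}^{N} x_i^{a_i} \le \sum_{i=1}^{N} a_i x_i. \tag{34} \]

Thus

\[ \sigma_1 \sigma_2 \cdots \sigma_e = (\sigma_1 \sigma_e)(\sigma_2 \cdots \sigma_{e-1}) \]

\[ \le \frac{8}{9} \left( \frac{\sigma_1 + \sigma_e}{2} \right)^2 \left( \frac{\sigma_2 + \cdots + \sigma_{e-1}}{e - 2} \right)^{e-2}, \quad \text{by (ii) and (34),} \]

\[ \le \frac{8}{9} \left( \frac{\sigma_1 + \cdots + \sigma_e}{e} \right)^e, \quad \text{using (34) again.} \]

Thus we have an improved version of the arithmetic-mean geometric-mean
inequality which applies to the σ_i's. From Lemma 36, the last expression is

\[ \frac{8}{9} \left\{ \frac{(n - e)(q - 1)}{q} + \frac{e + 1}{2} \right\}^e \tag{35} \]

\[ \le \frac{8}{9} q^{-e} (q - 1)^e n^e \tag{36} \]

after a bit of algebra using q > 2. Combining (33) and (36) gives

\[ 1 - \frac{e(e - 1)}{2n} < \frac{8}{9}, \]

\[ n < \tfrac{9}{2} e(e - 1). \tag{37} \]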


(v) Since

\[ \prod_{i=1}^{e} (\sigma_i - 1) = \frac{e! \, L_e(1)}{q^e} = \frac{(q - 1)^e (n - 1)(n - 2) \cdots (n - e)}{q^e} \tag{38} \]

must be an integer, we have

\[ p^{re} \mid (n - 1)(n - 2) \cdots (n - e). \]

Suppose p^α is the largest power of p dividing any of n - 1, ..., n - e. What
can we say about the power of p dividing the product (n - 1) ··· (n - e)?
This is at most

\[ A = p^{\alpha + [e/p] + [e/p^2] + \cdots} \]

and we must have A ≥ p^{re}. Therefore

\[ \alpha \ge re - \left[ \frac{e}{p} \right] - \left[ \frac{e}{p^2} \right] - \cdots \ge e \left( r - \frac{1}{p - 1} \right) \ge \frac{er}{2}. \]

Thus a power of p which is at least p^{er/2} = q^{e/2} divides n - i for some i, and so

\[ n > q^{e/2}. \tag{39} \]

(vi) From (37), (39)

\[ 3^{e/2} \le q^{e/2} < n < \tfrac{9}{2} e(e - 1), \]

which implies that e ≤ 11, q ≤ 8, and n < 495. By the computer search
mentioned above, there are no perfect codes in this range. Q.E.D.


Theorem 41. The only possible parameters for a nontrivial binary perfect
e-error-correcting code with e > 2 are those of the Golay code.


Proof. From (26),

\[ \prod_{i=1}^{e} (\sigma_i - 2) = \frac{e! \, L_e(2)}{2^e} = \frac{(n - 2)(n - 3) \cdots (n - e)(n - 2e - 1)}{2^e}. \tag{40} \]


Since (c, — 1Xa; — 2) is even, (38) and (40) imply 2*|(n — D(n - 2} >>- (n- 
eY(n —2e — 1), and hence n »2- 27, The rest of the proof follows that of 


Theorem 40, using (32) and (35) to get an upper bound on n analogous to (37).
Q.E.D. 


A code with the same parameters as the [23, 12, 7] binary Golay code has 
the same weight and distance distributions as this code. This follows from 
Theorem 4, since d = 7 and s' = e = 3. Similarly for the ternary Golay code.


Notes on Chapter 6 


Paige [1018] and Bose [174] were probably the first to obtain designs from 
codes, while Assmus and Mattson [37,41,47] were the first to give results 
similar to Theorems 9 and 23. Our treatment follows Delsarte [350—352]. 
Semakov and Zinov'ev [1180, 1181] study the connection between constant
weight codes and t-designs.

§3. Theorems 2 and 4 appear to be new.


§5. This section is based on Shaughnessy [1195].


§6. Most of this section follows Delsarte [351], although some of the proofs
are different. An alternative proof of Theorem 21 can be found in Assmus and
Mattson [41].
In Euclidean space of dimension n, Equation (20) can be improved to

\[ \tfrac{1}{2} \delta \le \rho \le \delta \sqrt{\frac{n}{2(n + 1)}} \]

(see Jung [702]). Problem 11 shows that no such improvement is possible in
Hamming space. Theorem 22 is from MacWilliams [872].


§8. Theorems 25, 27 are from MacWilliams [871, 872]. Problem 13 is due to
Assmus and Mattson [37] (see also [47]).

Theorem 28 is due to Lloyd [859]. For Theorem 32 and other generalizations
see Assmus and Mattson [41], Bassalygo [76a, 77], Biggs [144], Delsarte
[380], Lenstra [815] and Roos [1123].

A code is said to be uniformly packed if s' = e + 1 [505].


§9. Theorem 29 and Corollary 30 are due to Assmus and Mattson [41, 47].
Delsarte [351] has given a different generalization of Theorems 9 and 24 to 
codes over GF(q). 


§10. Van Lint [855] is an excellent survey article on perfect codes, and
includes references to the early work (which we have omitted). Tietäväinen's
work on perfect codes will be found in [1321-1324], and Van Lint's in
[845-850, 854, 855]. For some recent Russian work see [1471, 1472].

Schönheim [1159] (see also [1160]) and Lindström [842] show that for any
q ≥ 3 and m ≥ 2 there is a genuinely nonlinear perfect single-error-correcting
code of length n = (q^{m+1} - 1)/(q - 1) over GF(q).

The proof of Theorem 33 given here follows Van Lint [848] and [854] and 
Tietäväinen [1323], with contributions by H.W. Lenstra and D.H. Smith 
(unpublished) in the proof of Theorem 41. The computer search used in 
proving Theorems 40 and 41 was made by Van Lint [845]; Lenstra and A.M. 
Odlyzko (unpublished) have shown that it can be avoided by tightening the 
inequalities. For the arithmetic-mean geometric-mean inequality see for ex- 
ample Beckenbach and Bellman [94, p. 13]. 

Generalizations of perfect codes. For perfect codes in the Lee metric, or
over mixed alphabets, or in graphs, see Astola [54], Bassalygo [77], Baumert et
al. [84], Biggs [144-147], Cameron et al. [236], Golomb et al.
[524, 525, 529, 531, 532], Hammond [593, 594], Heden [628-630], Herzog and
Schönheim [643], Johnson [698], Lindström [843], Post [1072], Racsmány
[1084], Schönheim [1159, 1160, 1162] and Thas [1319].





Cyclic codes 


§1. Introduction 


Cyclic codes are the most studied of all codes, since they are easy to 
encode, and include the important family of BCH codes. Furthermore they 
are building blocks for many other codes, such as the Kerdock, Preparata, 
and Justesen codes (see later chapters). 

In this chapter we begin by defining a cyclic code to be an ideal in the ring
of polynomials modulo x^n - 1 (§2). A cyclic code of length n over GF(q)
consists of all multiples of a generator polynomial g(x), which is the monic
polynomial of least degree in the code, and is a divisor of x^n - 1 (§3). The
polynomial h(x) = (x^n - 1)/g(x) is called the check polynomial of the code
(§4).

In §5 we study the factors of x^n - 1. We always assume that n and q are
relatively prime. Then the zeros of x^n - 1 lie in the field GF(q^m), where m is
the least positive integer such that n divides q^m - 1.

At the end of §3 it is shown that Hamming and double-error-correcting
BCH codes are cyclic. Then in §6 we give the general definition of
t-error-correcting BCH codes over GF(q). In §7 we look more generally at how a
matrix over GF(q^m) defines a code over GF(q). The last section describes
techniques for encoding cyclic codes.

Further properties of cyclic codes are dealt with in the next chapter. 


§2. Definition of a cyclic code


A code 𝒞 is cyclic if it is linear and if any cyclic shift of a codeword is also
a codeword, i.e., whenever (c_0, c_1, ..., c_{n-1}) is in 𝒞 then so is
(c_{n-1}, c_0, ..., c_{n-2}). For example, the Hamming code (40) of Ch. 1 is cyclic. So
is the code 𝒞_3 = {000, 110, 101, 011}.

To get an algebraic description, we associate with the vector c =
(c_0, c_1, ..., c_{n-1}) in F^n (where F is any finite field GF(q)) the polynomial
c(x) = c_0 + c_1 x + ··· + c_{n-1} x^{n-1}. E.g. 𝒞_3 corresponds to the polynomials 0,
1 + x, 1 + x^2, x + x^2.

We shall use the following notation. If F is a field, F[x] denotes the set of 
polynomials in x with coefficients from F. 

In fact, F[x] is a ring. 


Definition. A ring R (loosely speaking) is a set where addition, subtraction,
and multiplication are possible. Formally, R is an additive abelian group,
together with a multiplication satisfying ab = ba, a(b + c) = ab + ac, (ab)c =
a(bc), and which contains an identity element 1 such that 1 · a = a. [Thus our
ring is sometimes called a commutative ring with identity.]


The ring R_n = F[x]/(x^n - 1). For our purposes another ring is more important
than F[x]. This is the ring R_n = F[x]/(x^n - 1), consisting of the residue classes
of F[x] modulo x^n - 1. Each polynomial of degree ≤ n - 1 belongs to a different
residue class, and we take this polynomial as representing its residue class.
Thus we can say that c(x) belongs to R_n. R_n is a vector space of dimension n
over F.


Multiplying by x corresponds to a cyclic shift. If we multiply c(x) by x in R_n we
get

\[ x c(x) = c_0 x + c_1 x^2 + \cdots + c_{n-2} x^{n-1} + c_{n-1} x^n \]

\[ = c_{n-1} + c_0 x + \cdots + c_{n-2} x^{n-1}, \]

since x^n = 1 in R_n. But this is the polynomial associated with the vector
(c_{n-1}, c_0, ..., c_{n-2}). Thus multiplying by x in R_n corresponds to a cyclic shift!
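
The correspondence between multiplication by x and a cyclic shift is easy to check numerically. The following Python sketch is our own illustration: polynomials are stored as coefficient lists with the lowest degree first, the function names are hypothetical, and q is assumed prime so that arithmetic mod q is the field GF(q).

```python
def multiply_in_Rn(a, b, n, q):
    """Multiply two polynomials in R_n = F[x]/(x^n - 1), F = GF(q), q prime.

    a and b are coefficient lists of length at most n, lowest degree first.
    """
    result = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            # x^(i+j) reduces to x^((i+j) mod n) because x^n = 1 in R_n
            result[(i + j) % n] = (result[(i + j) % n] + ai * bj) % q
    return result

def cyclic_shift(c):
    """(c_0, ..., c_{n-1}) -> (c_{n-1}, c_0, ..., c_{n-2})."""
    return [c[-1]] + c[:-1]

x = [0, 1, 0, 0, 0, 0, 0]          # the polynomial x in R_7
c = [1, 1, 0, 1, 0, 0, 0]          # c(x) = 1 + x + x^3
print(multiply_in_Rn(x, c, 7, 2))  # [0, 1, 1, 0, 1, 0, 0]
print(cyclic_shift(c))             # [0, 1, 1, 0, 1, 0, 0]
```

Both calls return the same vector; repeated multiplication by x runs through all cyclic shifts of c.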


Ideals. 


Definition. An ideal 𝒞 of R_n is a linear subspace of R_n such that:

(i) if c(x) ∈ 𝒞 then so is r(x)c(x) for all r(x) ∈ R_n. Clearly (i) can be
replaced by:

(ii) if c(x) ∈ 𝒞 then so is xc(x).


Our initial definition can now be simply written as:


Definition. A cyclic code of length n is an ideal of R_n.

Example. 𝒞_3 = {0, 1 + x, 1 + x^2, x + x^2} is an ideal in R_3. For 𝒞_3 is closed under
addition (hence linear), and any multiple of c(x) ∈ 𝒞_3 is again in 𝒞_3 (e.g.,
x^2(1 + x^2) = x^2 + x^4 = x^2 + x, since x^4 = x in R_3).


The group algebra FG. A second description of R_n is often helpful. Let
G = {1, x, x^2, ..., x^{n-1}}, x^n = 1, be a cyclic group of order n. The group algebra
FG of G over F (cf. §3 of Ch. 5) consists of all formal sums

\[ c(x) = \sum_{i=0}^{n-1} c_i x^i, \quad c_i \in F. \]

Addition in FG is by coordinates, and multiplication is modulo x^n - 1. Clearly
FG coincides with R_n.


Problems. (1) What is the ideal describing the cyclic code {0000, 0101, 1010, 
1111}? 

(2) Describe the smallest cyclic code containing the vector 0011010. 

(3) Show that R_n is not a field (Hint: x - 1 has no multiplicative inverse).

(4) Show that a polynomial f(x) has a multiplicative inverse in R_n if and
only if f(x) is relatively prime to x^n - 1 in F[x].

(5) Show that in an [n, k] cyclic code any k consecutive symbols may be 
taken as information symbols. 


§3. Generator polynomial


A particularly simple kind of ideal is a principal ideal, which consists of all
multiples of a fixed polynomial g(x) by elements of R_n. It will be denoted by


⟨g(x)⟩.


g(x) is called a generator polynomial of the ideal.

In fact every ideal in R_n is a principal ideal; every cyclic code has a
generator polynomial. The next theorem proves this and other basic properties
of cyclic codes.


Theorem 1. Let 𝒞 be a nonzero ideal in R_n, i.e., a cyclic code of length n.

(a) There is a unique monic polynomial g(x) of minimal degree in 𝒞.

(b) 𝒞 = ⟨g(x)⟩, i.e., g(x) is a generator polynomial of 𝒞.

(c) g(x) is a factor of x^n - 1.

(d) Any c(x) ∈ 𝒞 can be written uniquely as c(x) = f(x)g(x) in F[x], where
f(x) ∈ F[x] has degree < n - r, r = deg g(x). The dimension of 𝒞 is n - r. Thus
the message f(x) becomes the codeword f(x)g(x).

(e) If g(x) = g_0 + g_1 x + ··· + g_r x^r, then 𝒞 is generated (as a subspace of F^n)
by the rows of the generator matrix

\[
G = \begin{pmatrix}
g_0 & g_1 & g_2 & \cdots & g_r & & & 0 \\
 & g_0 & g_1 & \cdots & g_{r-1} & g_r & & \\
 & & \ddots & & & & \ddots & \\
0 & & & g_0 & \cdots & \cdots & & g_r
\end{pmatrix}
= \begin{pmatrix} g(x) \\ x g(x) \\ \vdots \\ x^{n-r-1} g(x) \end{pmatrix} \tag{1}
\]

using an obvious notation.


Proof. (a) Suppose f(x), g(x) ∈ 𝒞 both are monic and have the minimal degree r.
But then f(x) - g(x) ∈ 𝒞 has lower degree, a contradiction unless f(x) = g(x). (b)
Suppose c(x) ∈ 𝒞. Write c(x) = q(x)g(x) + r(x) in R_n, where deg r(x) < r. But
r(x) = c(x) - q(x)g(x) ∈ 𝒞 since the code is linear; so r(x) = 0. Therefore
c(x) ∈ ⟨g(x)⟩. (c) Write x^n - 1 = h(x)g(x) + r(x) in F[x], where deg r(x) < r. In
R_n this says r(x) = -h(x)g(x) ∈ 𝒞, a contradiction unless r(x) = 0. (d), (e):
From (b), any c(x) ∈ 𝒞, deg c(x) < n, is equal to q(x)g(x) in R_n. Thus

\[ c(x) = q(x)g(x) + e(x)(x^n - 1) \quad \text{in } F[x], \]
\[ = (q(x) + e(x)h(x)) g(x) \quad \text{in } F[x], \]
\[ = f(x)g(x) \quad \text{in } F[x], \tag{2} \]

where deg f(x) ≤ n - r - 1. Thus the code consists of multiples of g(x) by
polynomials of degree ≤ n - r - 1, evaluated in F[x] (not in R_n). There are
n - r linearly independent multiples of g(x), namely
g(x), xg(x), ..., x^{n-r-1} g(x). The corresponding vectors are the rows of G.
Thus the code has dimension n - r. Q.E.D.
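
Parts (d) and (e) of Theorem 1 translate directly into a small computation. The sketch below is our own illustration with hypothetical function names; it assumes q is prime, stores polynomials as coefficient lists (lowest degree first), builds the rows g(x), xg(x), ..., x^{n-r-1}g(x), and checks by exhaustion that the row space is closed under cyclic shifts.

```python
from itertools import product

def generator_matrix(g, n):
    """Rows g(x), x g(x), ..., x^(n-r-1) g(x) as length-n vectors, r = deg g."""
    r = len(g) - 1
    return [[0] * i + list(g) + [0] * (n - r - 1 - i) for i in range(n - r)]

def codewords(G, q=2):
    """All GF(q)-linear combinations of the rows of G (q assumed prime)."""
    n = len(G[0])
    words = set()
    for coeffs in product(range(q), repeat=len(G)):
        word = tuple(
            sum(a * row[j] for a, row in zip(coeffs, G)) % q for j in range(n)
        )
        words.add(word)
    return words

# g(x) = 1 + x + x^3 divides x^7 - 1 over GF(2), so it generates a [7, 4] cyclic code
G = generator_matrix([1, 1, 0, 1], 7)
code = codewords(G)
is_cyclic = all(w[-1:] + w[:-1] in code for w in code)
print(len(code), is_cyclic)
```

The code has q^{n-r} = 16 codewords and is closed under cyclic shifts. The exhaustive enumeration is only practical for small dimension.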


We next give some examples of cyclic codes. 


Binary Hamming codes. Recall from §7 of Ch. 1 that the parity check matrix
of a binary Hamming code of length n = 2^m - 1 has as columns all 2^m - 1
distinct nonzero m-tuples. Now if α is a primitive element of GF(2^m) (§2 of
Ch. 3, §2 of Ch. 4) then 1, α, α^2, ..., α^{2^m-2} are distinct and can be represented
by distinct nonzero binary m-tuples.

So the binary Hamming code ℋ_m with parameters

\[ [n = 2^m - 1,\; k = n - m,\; d = 3] \]

has a parity check matrix which can be taken to be

\[ H = (1, \alpha, \alpha^2, \ldots, \alpha^{2^m-2}), \tag{3} \]

where each entry is to be replaced by the corresponding column vector of m
0's and 1's.
E.g. for ℋ_3,

\[
H = (1, \alpha, \alpha^2, \alpha^3, \alpha^4, \alpha^5, \alpha^6)
  = \begin{pmatrix}
  0 & 0 & 1 & 0 & 1 & 1 & 1 \\
  0 & 1 & 0 & 1 & 1 & 1 & 0 \\
  1 & 0 & 0 & 1 & 0 & 1 & 1
  \end{pmatrix}, \tag{4}
\]

where α ∈ GF(2^3) satisfies α^3 + α + 1 = 0.
A vector c = (c_0, c_1, ..., c_{n-1}) belongs to ℋ_m

\[ \text{iff } H c^T = 0 \]
\[ \text{iff } \sum_{i=0}^{n-1} c_i \alpha^i = 0 \]
\[ \text{iff } c(\alpha) = 0, \]

where c(x) = c_0 + c_1 x + ··· + c_{n-1} x^{n-1}. From property (M2) of Ch. 4, c ∈ ℋ_m iff
the minimal polynomial M^{(1)}(x) divides c(x). Thus ℋ_m consists of all multiples
of M^{(1)}(x), or in other words:


Theorem 2. The Hamming code ℋ_m as defined above is a cyclic code with
generator polynomial g(x) = M^{(1)}(x).


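From Theorem 1 a generator matrix for ℋ_m is

\[
G = \begin{pmatrix}
M^{(1)}(x) \\ x M^{(1)}(x) \\ x^2 M^{(1)}(x) \\ \vdots \\ x^{n-m-1} M^{(1)}(x)
\end{pmatrix}. \tag{5}
\]

E.g. for ℋ_3,

\[
G = \begin{pmatrix}
1 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1
\end{pmatrix}. \tag{6}
\]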


Problem. (6) Verify that the rows of (4) are orthogonal to those of (6). 


Double-error-correcting BCH codes. In Equation (11) of Ch. 3 a
double-error-correcting code 𝒞 of length n = 2^m - 1 was defined to have the parity check
matrix

\[
H = \begin{pmatrix}
1 & \alpha & \alpha^2 & \cdots & \alpha^{2^m-2} \\
1 & \alpha^3 & \alpha^6 & \cdots & \alpha^{3(2^m-2)}
\end{pmatrix}, \tag{7}
\]

where again each entry is to be replaced by the corresponding binary m-tuple.
Now

\[ c \in \mathscr{C} \text{ iff } H c^T = 0 \]
\[ \text{iff } \sum_{i=0}^{n-1} c_i \alpha^i = 0 \text{ and } \sum_{i=0}^{n-1} c_i \alpha^{3i} = 0 \]
\[ \text{iff } c(\alpha) = 0 \text{ and } c(\alpha^3) = 0 \]
\[ \text{iff } M^{(1)}(x) \mid c(x) \text{ and } M^{(3)}(x) \mid c(x), \text{ by (M2),} \]

where M^{(3)}(x) is the minimal polynomial of α^3,

\[ \text{iff } \operatorname{l.c.m.}\{M^{(1)}(x), M^{(3)}(x)\} \mid c(x). \]

But M^{(1)}(x) and M^{(3)}(x) are irreducible (by (M1)) and distinct (by Problem 15
of Ch. 4), so finally we have

\[ c \in \mathscr{C} \text{ iff } M^{(1)}(x) M^{(3)}(x) \mid c(x). \]

Thus we have proved:


Theorem 3. The double-error-correcting BCH code 𝒞 has parameters

\[ [n = 2^m - 1,\; k = n - 2m,\; d \ge 5], \quad m \ge 3, \]

and is a cyclic code with generator polynomial

\[ g(x) = M^{(1)}(x) M^{(3)}(x). \]


Proof. deg g(x) = 2m follows from Problem 15 of Chapter 4. The minimum
distance was established in Ch. 3 (another proof is given by Theorem 8
below). Q.E.D.


Problem. (7) Show that the double-error-correcting BCH code of length 15
given in §3 of Ch. 3 has generator polynomial g(x) =
(x^4 + x + 1)(x^4 + x^3 + x^2 + x + 1) (use §4 of Ch. 4). Give a generator matrix.


Remark. So far nothing has been said about the minimum distance d of a 
cyclic code. This is because in general it is very difficult to find d. The BCH 
bound (Theorem 8 below) will give a lower bound to d if the zeros of g(x) are 
known. In Ch. 8 we will see that in some cases the Mattson-Solomon 
polynomial enables one to find d. 


Problems. (8) Nonbinary Hamming codes. (a) The Hamming code ℋ_m(q) over
GF(q) has an m × (q^m - 1)/(q - 1) parity check matrix whose columns are
all nonzero m-tuples from GF(q) with first nonzero entry equal to 1. (Code #6
of Ch. 1 shows ℋ_2(3).) Prove that ℋ_m(q) is a perfect [n = (q^m - 1)/(q - 1),
k = n - m, d = 3] code.

(b) If m and q - 1 are relatively prime prove that ℋ_m(q) is equivalent to the
cyclic code with zeros α, α^q, α^{q^2}, ... where α ∈ GF(q^m) is a primitive n-th root
of unity.

(9) Shortened cyclic codes. Engineering constraints sometimes call for a
code of a length for which there is no good cyclic code. In this case a
shortened cyclic code 𝒞* can be used, obtained by taking those codewords of
a cyclic code 𝒞 which begin with i consecutive zeros, and deleting these
zeros. The resulting code is of course not cyclic. However, show that there is
a polynomial f(x) such that 𝒞* is an ideal in the ring of polynomials mod f(x),
and conversely, any ideal in such a ring is a shortened cyclic code.


§4. The check polynomial

Let 𝒞 be a cyclic code with generator polynomial g(x). From Theorem 1,
g(x) divides x^n - 1. Then

\[ h(x) = (x^n - 1)/g(x) = \sum_{i=0}^{k} h_i x^i \ \text{(say)}, \quad (h_k \ne 0) \]

is called the check polynomial of 𝒞. The reason for this name is as follows. If

\[ c(x) = \sum_{i=0}^{n-1} c_i x^i = f(x)g(x) \]

is any codeword of 𝒞, then

\[ c(x)h(x) = \sum_{i=0}^{n-1} c_i x^i \sum_{j=0}^{k} h_j x^j = f(x)g(x)h(x) = 0 \quad \text{in } R_n. \]

The coefficient of x^j in this product is

\[ \sum_{i=0}^{n-1} c_i h_{j-i} = 0, \quad j = 0, 1, \ldots, n - 1, \tag{8} \]

where the subscripts are taken modulo n. Thus the Equations (8) are parity 
check equations satisfied by the code. Let

\[
H = \begin{pmatrix}
0 & \cdots & \cdots & 0 & h_k & \cdots & h_2 & h_1 & h_0 \\
0 & \cdots & 0 & h_k & \cdots & h_2 & h_1 & h_0 & 0 \\
\vdots & & & & & & & & \vdots \\
h_k & \cdots & h_2 & h_1 & h_0 & 0 & \cdots & \cdots & 0
\end{pmatrix} \tag{9}
\]

using an obvious notation. Then (8) says that if c ∈ 𝒞 then Hc^T = 0. Since

\[ k = \deg h(x) = n - \deg g(x) = \text{dimension } \mathscr{C}, \]

and the rows of H are obviously linearly independent, the condition Hc^T = 0
is also sufficient for c to be in the code. Thus H is a parity check matrix for
𝒞.


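Example. For the Hamming code ℋ_3, h(x) = (x^7 + 1)/(x^3 + x + 1) =
(x + 1)(x^3 + x^2 + 1) = x^4 + x^2 + x + 1. Thus

\[
H = \begin{pmatrix}
0 & 0 & 1 & 0 & 1 & 1 & 1 \\
0 & 1 & 0 & 1 & 1 & 1 & 0 \\
1 & 0 & 1 & 1 & 1 & 0 & 0
\end{pmatrix}. \tag{10}
\]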


This is the same as (4). 
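
The computation in this example can be reproduced mechanically. The sketch below is our own illustration (the helper `divide_gf2` is hypothetical and works only over GF(2)): it divides x^7 + 1 by g(x), then builds parity checks by placing the reversed coefficients h_k, ..., h_0 in each window of length k + 1, as prescribed by the parity check equations (8), and verifies that they are orthogonal to the shifts of g(x).

```python
def divide_gf2(num, den):
    """Divide polynomials over GF(2); coefficient lists, lowest degree first.

    Returns (quotient, remainder). Assumes den has leading coefficient 1.
    """
    num = list(num)
    top = len(num) - len(den)
    quotient = [0] * (top + 1)
    for shift in range(top, -1, -1):
        if num[shift + len(den) - 1]:
            quotient[shift] = 1
            for i, d in enumerate(den):
                num[shift + i] ^= d          # subtract x^shift * den(x)
    return quotient, num[: len(den) - 1]

n = 7
g = [1, 1, 0, 1]                              # g(x) = 1 + x + x^3
x_n_plus_1 = [1] + [0] * (n - 1) + [1]        # x^7 + 1 (= x^7 - 1 over GF(2))
h, remainder = divide_gf2(x_n_plus_1, g)
k = len(h) - 1

# Row i holds h_k, ..., h_0 (h reversed), moved i places to the left
H = [[0] * (n - k - 1 - i) + h[::-1] + [0] * i for i in range(n - k)]
G = [[0] * i + g + [0] * (n - len(g) - i) for i in range(n - len(g) + 1)]
orthogonal = all(
    sum(a * b for a, b in zip(hr, gr)) % 2 == 0 for hr in H for gr in G
)
print(h, remainder, orthogonal)
```

The quotient is h(x) = 1 + x + x^2 + x^4 with zero remainder, and every row of H is orthogonal to every row of G.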


Problem. (10) For the code of Problem 7, give h(x), H, and verify that this 
matrix defines the same code as Equation (10) of Ch. 3. 

Note that Equation (8) says that the codeword c must satisfy the parity
check equations

\[ c_{n-k-1} h_k + c_{n-k} h_{k-1} + \cdots + c_{n-1} h_0 = 0, \]
\[ c_{n-k-2} h_k + c_{n-k-1} h_{k-1} + \cdots + c_{n-2} h_0 = 0, \]
\[ \cdots \tag{11} \]
\[ c_0 h_k + c_1 h_{k-1} + \cdots + c_k h_0 = 0. \]

I.e., c satisfies the linear recurrence

\[ c_t h_k + c_{t+1} h_{k-1} + \cdots + c_{t+k} h_0 = 0 \tag{12} \]

for 0 ≤ t ≤ n - k - 1. Thus if c_{n-1}, ..., c_{n-k} are taken as message symbols,
Equations (12) successively define the check symbols c_{n-k-1}, ..., c_0 (since
h_k ≠ 0).

The dual code. Let 𝒞 be a cyclic code with generator polynomial g(x) and
check polynomial h(x) = (x^n - 1)/g(x).


Theorem 4. The dual code 𝒞^⊥ is cyclic and has generator polynomial

\[ g^{\perp}(x) = x^{\deg h(x)} h(x^{-1}). \]

Proof. From Equation (9). Q.E.D.


By this theorem the code with generator polynomial h(x) is equivalent to
𝒞^⊥. In fact it consists of the codewords of 𝒞^⊥ written backwards.
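
Theorem 4 says that the generator of the dual code is obtained by reversing the coefficients of h(x). A minimal Python check for the length 7 binary code with g(x) = 1 + x + x^3 follows; it is our own illustration, with hypothetical helper names, and the value of h(x) is taken as given rather than computed.

```python
def reciprocal(h):
    """Coefficients of x^(deg h) * h(1/x): the coefficient list of h reversed."""
    return h[::-1]

def shifts(p, n):
    """All n cyclic shifts of the length-n vector of p(x) (lowest degree first)."""
    v = list(p) + [0] * (n - len(p))
    return [v[-i:] + v[:-i] for i in range(n)]

n = 7
g = [1, 1, 0, 1]          # g(x) = 1 + x + x^3
h = [1, 1, 1, 0, 1]       # h(x) = (x^7 + 1)/g(x) = 1 + x + x^2 + x^4
g_dual = reciprocal(h)    # 1 + x^2 + x^3 + x^4

# Every shift of g(x) is orthogonal to every shift of the reciprocal of h(x)
orthogonal = all(
    sum(a * b for a, b in zip(u, v)) % 2 == 0
    for u in shifts(g, n)
    for v in shifts(g_dual, n)
)
print(g_dual, orthogonal)
```

The reversed polynomial is 1 + x^2 + x^3 + x^4, and its shifts are orthogonal to all shifts of g(x), as the theorem predicts.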


Problem. (11) Show that the [7, 4, 3] code with g(x) = x^3 + x + 1 and the
[7, 3, 4] code with g(x) = x^4 + x^3 + x^2 + 1 are duals.


§5. Factors of x^n - 1


Since the generator polynomial of a cyclic code of length n over GF(q)
must be a factor of x^n - 1, in this section we study these factors, and
introduce the splitting field of x^n - 1.

We assume always that n and q are relatively prime. (Thus n is odd in the
binary case.) By Problem 8 of Ch. 4 there is a smallest integer m such that n
divides q^m - 1. This m is called the multiplicative order of q modulo n. By
Problem 11 of Ch. 4, x^n - 1 divides x^{q^m-1} - 1 but does not divide x^{q^s-1} - 1 for
0 < s < m.

Thus the zeros of x^n - 1, which are called n-th roots of unity, lie in the
extension field GF(q^m) and in no smaller field.

The derivative of x^n - 1 is nx^{n-1}, which is relatively prime to x^n - 1, since n
and q are relatively prime. Therefore x^n - 1 has n distinct zeros.


Factoring x^n - 1 over GF(q^m). Thus there are n distinct elements
α_0, α_1, ..., α_{n-1} in GF(q^m) (the n-th roots of unity) such that

\[ x^n - 1 = \prod_{i=0}^{n-1} (x - \alpha_i). \]

GF(q^m) is therefore called the splitting field of x^n - 1.

Problems. (12) Show that the zeros of x^n - 1 form a cyclic subgroup of
GF(q^m)*. I.e., there is an element α in GF(q^m), called a primitive n-th root of
unity, such that

\[ x^n - 1 = \prod_{i=0}^{n-1} (x - \alpha^i). \tag{13} \]

Throughout this chapter α will denote a primitive n-th root of unity.

(13) Show that the zeros of x^n - 1 form the multiplicative group of a field iff
n = q^m - 1.


Factoring x^n - 1 over GF(q). Cyclotomic cosets mod p^m - 1 were defined in
§3 of Ch. 4. More generally the cyclotomic coset mod n over GF(q) which
contains s is

\[ C_s = \{ s, sq, sq^2, \ldots, sq^{m_s - 1} \}, \]

where sq^{m_s} ≡ s mod n. (It is convenient but not essential to choose s to be the
smallest integer in C_s.) Then the integers mod n are partitioned into
cyclotomic cosets:

\[ \{0, 1, \ldots, n - 1\} = \bigcup_s C_s, \]

where s runs through a set of coset representatives mod n. Note that m = m_1 is
the number of elements in C_1.
E.g. for n = 9, q = 2,

C_0 = {0},
C_1 = {1, 2, 4, 8, 7, 5},
C_3 = {3, 6}.

Thus m = 6, and x^9 - 1 splits into linear factors over GF(2^6).
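
Cyclotomic cosets are simple to compute by repeated multiplication by q modulo n. The following Python sketch is our own illustration (the function name is hypothetical); it assumes gcd(n, q) = 1 and lists each coset in the order s, sq, sq^2, ....

```python
def cyclotomic_cosets(n, q):
    """Partition {0, ..., n-1} into cyclotomic cosets mod n over GF(q).

    Assumes gcd(n, q) = 1. Cosets are returned in order of their smallest
    element, each listed as s, sq, sq^2, ... (mod n).
    """
    seen = set()
    cosets = []
    for s in range(n):
        if s in seen:
            continue
        coset = [s]
        t = (s * q) % n
        while t != s:
            coset.append(t)
            t = (t * q) % n
        seen.update(coset)
        cosets.append(coset)
    return cosets

cosets = cyclotomic_cosets(9, 2)
m = len(cosets[1])   # size of C_1, i.e. the multiplicative order of q mod n
print(cosets, m)
```

For n = 9, q = 2 this reproduces the three cosets of the example and m = 6. The coset sizes are also the degrees of the irreducible factors of x^n - 1 over GF(q).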
Then as in Ch. 4 (see especially Problem (14)) the minimal polynomial of
α^s is

\[ M^{(s)}(x) = \prod_{i \in C_s} (x - \alpha^i). \]

This is a monic polynomial with coefficients from GF(q), and is the lowest
degree such polynomial having α^s as a root.
Also

\[ x^n - 1 = \prod_s M^{(s)}(x), \tag{14} \]

where s runs through a set of coset representatives mod n. This is the
factorization of x^n - 1 into irreducible polynomials over GF(q).
E.g. n = 9, q = 2:

\[ x^9 + 1 = M^{(0)}(x) M^{(1)}(x) M^{(3)}(x), \]

where

M^{(0)}(x) = x + 1,

M^{(1)}(x) = x^6 + x^3 + 1,

M^{(3)}(x) = x^2 + x + 1.

Figure 7.1 gives the factors of x^n + 1 over GF(2) for n ≤ 63 and n = 127. Of
course x^{2n} + 1 = (x^n + 1)^2 (Lemma 5 of Ch. 4), so only odd values of n are
given. Also for n = 3, 5, 11, 13, 19, 29, 37, 53, 59, 61, ... the factorization is
x^n + 1 = (x + 1)(x^{n-1} + ··· + x + 1), since for these primes there are only two
cyclotomic cosets, C_0 and C_1. The factors are given in octal, with the lowest
degree terms on the left. Thus the first line of the table means that

\[ 1 + x^7 = (1 + x)(1 + x^2 + x^3)(1 + x + x^3). \]
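
The octal convention of Fig. 7.1 can be checked by decoding a line of the table and multiplying the factors back together. The Python sketch below is our own illustration with hypothetical function names; it expands each octal digit into three bits, reads the bits as coefficients of 1, x, x^2, ..., and multiplies over GF(2).

```python
def octal_to_poly(word):
    """Decode one table entry: octal digits -> bits, lowest degree term first."""
    bits = [int(b) for digit in word for b in format(int(digit, 8), "03b")]
    while bits and bits[-1] == 0:      # drop padding zeros above the leading term
        bits.pop()
    return bits

def multiply_gf2(a, b):
    """Product of two polynomials over GF(2), coefficient lists lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] ^= bj
    return out

def product_of_entry(line):
    """Multiply all factors on one line of the table, e.g. '6.54.64.'."""
    result = [1]
    for word in line.strip(".").split("."):
        result = multiply_gf2(result, octal_to_poly(word))
    return result

print(octal_to_poly("54"))            # [1, 0, 1, 1], i.e. 1 + x^2 + x^3
print(product_of_entry("6.54.64."))   # coefficients of 1 + x^7
```

The same check applied to the lines for n = 9 and n = 15 returns 1 + x^9 and 1 + x^15 respectively.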


Problems. (14) Show that if s is relatively prime to n, then C_s contains m
elements.

(15) Let

\[ f(x) = \prod_{k \in K} (x - \alpha^k), \]

where K is a subset of {0, 1, ..., n - 1}. Show that f(x) has coefficients in
GF(q) iff k ∈ K implies qk (mod n) ∈ K.


  n   Factors (in octal, lowest degree on left)

  7   6.54.64.
  9   6.7.444.
 15   6.7.46.62.76.
 17   6.471.727.
 21   6.7.54.64.534.724.
 23   6.5343.6165.
 25   6.76.4102041.
 27   6.7.444.4004004.
 31   6.45.51.57.67.73.75.
 33   6.7.4522.6106.7776.
 35   6.54.64.76.57134.72364.
 39   6.7.57074.74364.77774.
 41   6.5747175.6647133.
 43   6.47771.52225.64213.
 45   6.7.46.62.76.444.40044.44004.
 47   6.43073357.75667061.
 49   6.54.64.40001004.40200004.
 51   6.7.433.471.637.661.727.763.
 55   6.76.7776.5551347.7164555.
 57   6.7.5604164.7565674.7777774.
 63   6.7.54.64.414.444.534.554.604.634.664.714.724.
127   6.406.422.436.442.472.516.526.562.576.602.626.646.652.
      712.736.742.756.772.

Fig. 7.1. Factors of 1 + x^n over GF(2).

(16) Show that if

\[ h(x) = \prod_i (x - \alpha^i) \]

is a divisor of x^n - 1 over GF(q), then

\[ x^{\deg h(x)} h(x^{-1}) = \text{constant} \cdot \prod_i (x - \alpha^{-i}). \]

(17) Show that properties (M1)-(M6) given in Ch. 4 hold for this more
general definition of minimal polynomials.


The Zeros of a code. Let 𝒞 be a cyclic code with generator polynomial g(x).
Since g(x) is a divisor of x^n - 1 over GF(q) we have

\[ g(x) = \prod_{i \in K} (x - \alpha^i), \tag{15} \]

where (by Problem 15) i ∈ K implies qi (mod n) ∈ K. K is a union of cyclotomic
cosets. The n-th roots of unity {α^i : i ∈ K} are called the zeros of the code.
(Naturally the other n-th roots of unity are called the nonzeros; these are the
zeros of h(x) = (x^n - 1)/g(x).)

Clearly if c(x) ∈ R_n then c(x) belongs to 𝒞 iff c(α^i) = 0 for all i ∈ K. Thus
a cyclic code is defined in terms of the zeros of c(x). By Problem 16, the
zeros of the dual code are the inverses of the nonzeros of the original code.
I.e. if 𝒞 has zeros α^i where i runs through C_u, C_v, ..., then 𝒞^⊥ has nonzeros
α^j where j runs through C_{-u}, C_{-v}, ....

Up to now we have taken the generator polynomial of a code to be
g(x) = the lowest degree monic polynomial in the code. But other generators
are possible.


Lemma 5. If p(x) ∈ R_n does not introduce any new zeros, i.e., if p(α^i) ≠ 0 for
all i ∉ K, then g(x) and p(x)g(x) generate the same code. (E.g. g(x)^2 generates
the same code as g(x).)


Proof. Clearly ⟨g(x)⟩ ⊇ ⟨p(x)g(x)⟩. By hypothesis p(x) and h(x) are relatively
prime, so by Corollary 15 of Ch. 12 there exist polynomials a(x), b(x) such
that

\[ 1 = a(x)p(x) + b(x)h(x) \quad \text{in } F[x]. \]

Therefore

\[ g(x) = a(x)p(x)g(x) + b(x)g(x)h(x) \quad \text{in } F[x], \]
\[ = a(x)p(x)g(x) \quad \text{in } R_n, \]

and so ⟨g(x)⟩ ⊆ ⟨p(x)g(x)⟩. Q.E.D.







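Problem. (18) If 𝒞 = ⟨f(x)⟩ let K = {i, 0 ≤ i ≤ n - 1 : f(α^i) = 0}. Show that 𝒞 =
⟨g(x)⟩ where

\[ g(x) = \operatorname{l.c.m.}\{ M^{(i)}(x) : i \in K \}. \]


Lemma 6. Let ξ ∈ GF(q^m) be any zero of x^n - 1. Then

\[ \sum_{i=0}^{n-1} \xi^i = \begin{cases} 0 & \text{if } \xi \ne 1, \\ n & \text{if } \xi = 1. \end{cases} \]


Proof. If ξ = 1, the sum is equal to n. Suppose ξ ≠ 1. Then

\[ \sum_{i=0}^{n-1} \xi^i = \frac{1 - \xi^n}{1 - \xi} = 0. \]

Q.E.D.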


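Lemma 7. An inversion formula. The vector c = (c_0, c_1, ..., c_{n-1}) may be
recovered from c(x) = c_0 + c_1 x + ··· + c_{n-1} x^{n-1} by

\[ c_i = \frac{1}{n} \sum_{j=0}^{n-1} c(\alpha^j) \alpha^{-ij}. \tag{16} \]


Proof.

\[ \sum_{j=0}^{n-1} c(\alpha^j) \alpha^{-ij} = \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} c_k \alpha^{jk} \alpha^{-ij} \]

\[ = \sum_{k=0}^{n-1} c_k \sum_{j=0}^{n-1} \alpha^{j(k-i)} = n c_i \quad \text{by Lemma 6.} \]

Q.E.D.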
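
The inversion formula can be tried out in a concrete field. The sketch below is our own illustration for n = 7 over GF(2): it represents GF(8) as 3-bit integers with α^3 = α + 1 (an assumption matching the Hamming code example of §3), evaluates c(α^j) for all j, and recovers c from those values. Since n = 7 is odd, 1/n = 1 in characteristic 2. All names are hypothetical.

```python
# GF(8) = GF(2)[a]/(a^3 + a + 1); elements are 3-bit ints, bit i = coefficient of a^i
EXP = [1]
for _ in range(6):
    v = EXP[-1] << 1
    if v & 0b1000:
        v ^= 0b1011              # reduce using a^3 = a + 1
    EXP.append(v)
LOG = {v: i for i, v in enumerate(EXP)}

def gf_mul(x, y):
    """Multiply two elements of GF(8) via discrete logarithms."""
    if x == 0 or y == 0:
        return 0
    return EXP[(LOG[x] + LOG[y]) % 7]

def evaluate(c, j):
    """c(alpha^j) for a binary coefficient list c of length 7."""
    acc = 0
    for i, ci in enumerate(c):
        if ci:
            acc ^= EXP[(i * j) % 7]
    return acc

def invert(values):
    """c_i = sum_j c(alpha^j) alpha^(-ij); the factor 1/n equals 1 here."""
    out = []
    for i in range(7):
        acc = 0
        for j in range(7):
            acc ^= gf_mul(values[j], EXP[(-i * j) % 7])
        out.append(acc)
    return out

c = [1, 1, 0, 1, 0, 0, 0]                    # c(x) = 1 + x + x^3
values = [evaluate(c, j) for j in range(7)]
print(values, invert(values))
```

Here c(α), c(α^2) and c(α^4) vanish, as expected for the polynomial M^{(1)}(x), and the inversion returns the original coefficient vector.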


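Problems. (19) Let A be the following Vandermonde matrix over GF(2^m) (cf.
Lemma 17 of Ch. 4):

\[
A = \begin{pmatrix}
1 & 1 & \cdots & 1 \\
\alpha_0 & \alpha_1 & \cdots & \alpha_{n-1} \\
\alpha_0^2 & \alpha_1^2 & \cdots & \alpha_{n-1}^2 \\
\vdots & \vdots & & \vdots \\
\alpha_0^{n-1} & \alpha_1^{n-1} & \cdots & \alpha_{n-1}^{n-1}
\end{pmatrix},
\]

where the α_i are the n-th roots of unity. Show that det(A) = 1 and

\[
A^{-1} = \begin{pmatrix}
1 & \alpha_0^{-1} & \cdots & \alpha_0^{-(n-1)} \\
\vdots & \vdots & & \vdots \\
1 & \alpha_{n-1}^{-1} & \cdots & \alpha_{n-1}^{-(n-1)}
\end{pmatrix}.
\]

(20) Let c(x) = c_0 + c_1 x + ··· + c_{n-1} x^{n-1} ∈ R_n and

\[
M = \begin{pmatrix}
c_0 & c_1 & \cdots & c_{n-1} \\
c_{n-1} & c_0 & \cdots & c_{n-2} \\
\vdots & \vdots & & \vdots \\
c_1 & c_2 & \cdots & c_0
\end{pmatrix}.
\]

Show that A^{-1}MA = diag[c(α_0), c(α_1), ..., c(α_{n-1})]. (See also Problem 7 of
Ch. 16.) Hence give another proof that dim 𝒞 = n - deg g(x).

(21) Show that 1 is in a cyclic code iff g(1) ≠ 0. Show that a binary cyclic
code contains a codeword of odd weight iff it contains 1.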


$6. t-Error-correcting BCH codes 


Theorem 8. (The BCH bound.) Let $\mathscr{C}$ be a cyclic code with generator 
polynomial $g(x)$ such that for some integers $b \ge 0$, $\delta \ge 1$, 

\[ g(\alpha^b) = g(\alpha^{b+1}) = \cdots = g(\alpha^{b+\delta-2}) = 0. \]

I.e. the code has a string of $\delta - 1$ consecutive powers of $\alpha$ as zeros. Then the 
minimum distance of the code is at least $\delta$. 


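Proof. If $c = (c_0, c_1, \dots, c_{n-1})$ is in $\mathscr{C}$ then 
\[ c(\alpha^b) = c(\alpha^{b+1}) = \cdots = c(\alpha^{b+\delta-2}) = 0, \]

so that $H'c^T = 0$ where 

\[ H' = \begin{pmatrix} 1 & \alpha^{b} & \alpha^{2b} & \cdots & \alpha^{(n-1)b} \\ 1 & \alpha^{b+1} & \alpha^{2(b+1)} & \cdots & \alpha^{(n-1)(b+1)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \alpha^{b+\delta-2} & \alpha^{2(b+\delta-2)} & \cdots & \alpha^{(n-1)(b+\delta-2)} \end{pmatrix}. \]

(Note that $H'$ need not be the full parity check matrix of $\mathscr{C}$.) The idea of the 
proof is to show that any $\delta - 1$ or fewer columns of $H'$ are linearly independent 
over $GF(q^m)$. Suppose $c$ has weight $w \le \delta - 1$, i.e., $c_i \neq 0$ iff 
$i \in \{a_1, a_2, \dots, a_w\}$. Then $H'c^T = 0$ implies 

\[ \begin{pmatrix} \alpha^{a_1 b} & \cdots & \alpha^{a_w b} \\ \alpha^{a_1(b+1)} & \cdots & \alpha^{a_w(b+1)} \\ \vdots & & \vdots \\ \alpha^{a_1(b+w-1)} & \cdots & \alpha^{a_w(b+w-1)} \end{pmatrix} \begin{pmatrix} c_{a_1} \\ c_{a_2} \\ \vdots \\ c_{a_w} \end{pmatrix} = 0. \]

Hence the determinant of the matrix on the left is zero. But this determinant 
is equal to $\alpha^{(a_1 + \cdots + a_w)b}$ times 

\[ \det \begin{pmatrix} 1 & \cdots & 1 \\ \alpha^{a_1} & \cdots & \alpha^{a_w} \\ \vdots & & \vdots \\ \alpha^{a_1(w-1)} & \cdots & \alpha^{a_w(w-1)} \end{pmatrix}, \]

which is a Vandermonde determinant and is nonzero by Lemma 17 of Ch. 4, a 
contradiction. Q.E.D. 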


Examples. The binary Hamming code $\mathscr{H}_m$ has generator polynomial $M^{(1)}(x)$ 
(see §3). Now $M^{(1)}(\alpha) = M^{(1)}(\alpha^2) = 0$ (by (M6) of Ch. 4), so the code has a pair 
of consecutive zeros, $\alpha$ and $\alpha^2$. Hence by Theorem 8 the minimum distance is 
$\ge 3$. 

The binary double-error-correcting BCH code has $g(x) = M^{(1)}(x)M^{(3)}(x)$. 
Now $M^{(1)}(\alpha) = M^{(1)}(\alpha^2) = M^{(1)}(\alpha^4) = 0$ and $M^{(3)}(\alpha^3) = 0$. Therefore there are 4 
consecutive zeros: $\alpha, \alpha^2, \alpha^3, \alpha^4$; and thus $d \ge 5$, in agreement with Ch. 3. 

The minimum distance of $\mathscr{C}$ may in fact be greater than $\delta$, for more than 
$\delta - 1$ columns of $H'$ may be linearly independent when the entries are 
replaced by $m$-tuples from $GF(q)$. 
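
The two examples above can be checked by exhaustive search. The following is only a sketch under stated assumptions: length $n = 15$, with $M^{(1)}(x) = x^4 + x + 1$ and $M^{(3)}(x) = x^4 + x^3 + x^2 + x + 1$ (the polynomials used for Examples (E1) and (E2) in §8), so that $g(x) = x^8 + x^7 + x^6 + x^4 + 1$ for the double-error-correcting code. The helper names `gf2_mul` and `min_distance` are illustrative and not part of the text.

```python
def gf2_mul(a: int, b: int) -> int:
    """Multiply two binary polynomials stored as bitmasks (bit i = coefficient of x^i)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return product


def min_distance(g: int, n: int) -> int:
    """Smallest weight among the nonzero codewords u(x)g(x), deg u < n - deg g."""
    k = n - (g.bit_length() - 1)
    return min(bin(gf2_mul(u, g)).count("1") for u in range(1, 1 << k))


d_hamming = min_distance(0b10011, 15)       # g(x) = x^4 + x + 1
d_bch = min_distance(0b111010001, 15)       # g(x) = x^8 + x^7 + x^6 + x^4 + 1
print(d_hamming, d_bch)
```

For these two codes the search meets the BCH bound with equality; the length 23 example later in this section shows that this need not happen.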


Corollary 9. A cyclic code of length $n$ with zeros $\alpha^{b}, \alpha^{b+r}, \alpha^{b+2r}, \dots, \alpha^{b+(\delta-2)r}$, 
where $r$ and $n$ are relatively prime, has minimum distance at least $\delta$. 


Proof. Let $\beta = \alpha^r$. Since $r$ and $n$ are relatively prime, $\beta$ is also a primitive $n$-th 
root of unity. Therefore $\alpha^b = \beta^i$ for some $i$, and the code has zeros 
$\beta^i, \beta^{i+1}, \dots, \beta^{i+\delta-2}$. The result follows from the proof of Theorem 8 with $\alpha$ 
replaced by $\beta$. 


BCH codes. 


Definition. A cyclic code of length $n$ over $GF(q)$ is a BCH code of designed 
distance $\delta$ if, for some integer $b \ge 0$, 

\[ g(x) = \text{l.c.m.}\{M^{(b)}(x), M^{(b+1)}(x), \dots, M^{(b+\delta-2)}(x)\}. \tag{17} \]

I.e. $g(x)$ is the lowest degree monic polynomial over $GF(q)$ having 
$\alpha^b, \alpha^{b+1}, \dots, \alpha^{b+\delta-2}$ as zeros. Therefore 

\[ c \text{ is in the code iff } c(\alpha^b) = c(\alpha^{b+1}) = \cdots = c(\alpha^{b+\delta-2}) = 0. \tag{18} \]

Thus the code has a string of $\delta - 1$ consecutive powers of $\alpha$ as zeros. From 
Theorem 8 we deduce that the minimum distance is greater than or equal to 
the designed distance $\delta$. 

Equation (18) also shows that a parity check matrix for the code is 

\[ H = \begin{pmatrix} 1 & \alpha^{b} & \alpha^{2b} & \cdots & \alpha^{(n-1)b} \\ 1 & \alpha^{b+1} & \alpha^{2(b+1)} & \cdots & \alpha^{(n-1)(b+1)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \alpha^{b+\delta-2} & \alpha^{2(b+\delta-2)} & \cdots & \alpha^{(n-1)(b+\delta-2)} \end{pmatrix}, \tag{19} \]

where each entry is replaced by the corresponding column of $m$ elements 
from $GF(q)$. 

After this replacement the rows of the resulting matrix over $GF(q)$ are the 
parity checks satisfied by the code. There are $m(\delta - 1)$ of these, but they need 
not all be linearly independent. Thus the dimension of the code is at least 
$n - m(\delta - 1)$. For a second proof, from (M4) of Ch. 4 $\deg M^{(i)}(x) \le m$, hence 
$\deg g(x) = n - \text{dimension of code} \le m(\delta - 1)$. Thus we have proved: 


Theorem 10. A BCH code over $GF(q)$ of length $n$ and designed distance $\delta$ has 
minimum distance $d \ge \delta$, and dimension $\ge n - m(\delta - 1)$. ($m$ was defined in 
§5.) 


The dimension will be greater than this if some of the rows of the $GF(q)$ 
version of $H$ are linearly dependent, or (equivalently) if the degree of the 
RHS of (17) is less than $m(\delta - 1)$. Examples of this are given below. 

A generator matrix and an alternative form for the parity check matrix are 
given by Equations (1) and (9) respectively. 


Remarks. (1) If $b = 1$ these are sometimes called narrow sense BCH codes. If 
$n = q^m - 1$ they are called primitive, for then $\alpha$ is a primitive element of the 
field $GF(q^m)$ (and not merely a primitive $n$-th root of unity). 

If some $\alpha^i$ is a zero of the code then so are all $\alpha^l$, for $l$ in the cyclotomic 
coset $C_i$. Since the cyclotomic cosets are smallest if $n = q^m - 1$, this is the 
most important case. 

(2) If $b$ is fixed, BCH codes are nested. I.e., the code of designed distance 
$\delta_1$ contains the code of designed distance $\delta_2$ iff $\delta_1 \le \delta_2$. 

(3) In general the dual of a BCH code is not a BCH code. 


Binary BCH codes. When $q = 2$, by property (M6) of Ch. 4, 
\[ M^{(2i)}(x) = M^{(i)}(x), \]

and so the degree of $g(x)$ can be reduced. For example if $b = 1$ we may 
always assume that the designed distance $\delta$ is odd. For the codes with 
designed distance $2t$ and $2t + 1$ coincide - both have 

\[ g(x) = \text{l.c.m.}\{M^{(1)}(x), M^{(3)}(x), \dots, M^{(2t-1)}(x)\}. \tag{20} \]

Thus $\deg g(x) \le mt$, and the dimension of the code is $\ge n - mt$. The parity 
check matrix is 

\[ H = \begin{pmatrix} 1 & \alpha & \alpha^2 & \cdots & \alpha^{n-1} \\ 1 & \alpha^3 & \alpha^6 & \cdots & \alpha^{3(n-1)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \alpha^{2t-1} & \alpha^{2(2t-1)} & \cdots & \alpha^{(2t-1)(n-1)} \end{pmatrix}, \tag{21} \]

where each entry is replaced by the corresponding binary $m$-tuple. Of course 
the second column of $H$ need only contain $\alpha, \alpha^{i_1}, \alpha^{i_2}, \dots$ where $1, i_1, i_2, \dots$ are 
in different cyclotomic cosets. 


Examples. We begin by listing in Figs. 7.2, 7.3 all the (narrow-sense, primitive) 
binary BCH codes of length 15 and 31. Fortunately the minimal polynomials 
of the elements of $GF(16)$ and $GF(32)$ were given in §4 of Ch. 4. 


designed            generator polynomial                          exponents of        dimension          actual 
distance $\delta$   $g(x)$                                        roots of $g(x)$     $= n - \deg g(x)$  distance $d$ 
1                   $1$                                           -                   15                 1 
3                   $M^{(1)}(x)$                                  1, 2, 4, 8          11                 3 
5                   $M^{(1)}(x)M^{(3)}(x)$                        1-4, 6, 8, 9, 12    7                  5 
7                   $M^{(1)}(x)M^{(3)}(x)M^{(5)}(x)$              1-6, 8-10, 12       5                  7 
9, 11, 13 or 15     $M^{(1)}M^{(3)}M^{(5)}M^{(7)} = (x^{15}+1)/(x+1)$   1-14          1                  15 


Fig. 7.2. BCH codes of length 15. 
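
The dimension column of such a table depends only on the cyclotomic cosets: for a narrow-sense binary BCH code the zeros of $g(x)$ are the union of the cosets of $1, 2, \dots, \delta - 1$, and the dimension is $n$ minus the size of that union. A minimal sketch of this count follows; the function names `cyclotomic_coset` and `bch_dimension` are illustrative choices, not notation from the text.

```python
def cyclotomic_coset(s: int, n: int, q: int = 2) -> frozenset:
    """The coset {s, sq, sq^2, ...} of s modulo n."""
    coset, x = set(), s % n
    while x not in coset:
        coset.add(x)
        x = (x * q) % n
    return frozenset(coset)


def bch_dimension(n: int, delta: int, b: int = 1, q: int = 2) -> int:
    """n - deg g(x), where g has zeros alpha^b, ..., alpha^(b + delta - 2) and their conjugates."""
    zeros = set()
    for i in range(b, b + delta - 1):
        zeros |= cyclotomic_coset(i, n, q)
    return n - len(zeros)


for n in (15, 31):
    print(n, [(delta, bch_dimension(n, delta)) for delta in range(1, n + 1, 2)])
```

Designed distances that give the same dimension here (for fixed $b$) give the same code, since the sets of zeros are nested.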


designed            generator polynomial                              dimension          actual 
distance $\delta$   $g(x)$                                            $= n - \deg g(x)$  distance $d$ 
1                   $1$                                               31                 1 
3                   $M^{(1)}$                                         26                 3 
5                   $M^{(1)}M^{(3)}$                                  21                 5 
7                   $M^{(1)}M^{(3)}M^{(5)}$                           16                 7 
9 or 11             $M^{(1)}M^{(3)}M^{(5)}M^{(7)}$                    11                 11 
13 or 15            $M^{(1)}M^{(3)}M^{(5)}M^{(7)}M^{(11)}$            6                  15 
17, 19, ..., 31     $M^{(1)}M^{(3)}M^{(5)}M^{(7)}M^{(11)}M^{(15)}$    1                  31 


Fig. 7.3. BCH codes of length 31. 

Note that the codes of designed distance 9 and 11 coincide. This is because 
the latter code has generator polynomial 

\[ g(x) = \text{l.c.m.}\{M^{(1)}(x), M^{(3)}(x), M^{(5)}(x), M^{(7)}(x), M^{(9)}(x)\}. \]

But $9 \in C_5$, so the minimal polynomials of $\alpha^5$ and $\alpha^9$ are the same: $M^{(5)}(x) = M^{(9)}(x)$. Therefore 

\[ g(x) = M^{(1)}(x)M^{(3)}(x)M^{(5)}(x)M^{(7)}(x), \]

which is also the generator polynomial of the code of designed distance 9. 
This example shows that a BCH code of designed distance $\delta$ may coincide 
with a BCH code of designed distance $\delta'$, where $\delta' > \delta$. The largest such $\delta'$ is 
called the Bose distance of the code. From the BCH bound the true minimum 
distance is at least equal to $\delta'$, but may be greater, as the next examples show. 
Finally Fig. 7.4 gives the binary (nonprimitive) BCH codes of length 
$n = 23$. The cyclotomic cosets are 

\[ C_0 = \{0\}, \]
\[ C_1 = \{1, 2, 4, 8, 16, 9, 18, 13, 3, 6, 12\}, \]
\[ C_5 = \{5, 10, 20, 17, 11, 22, 21, 19, 15, 7, 14\}. \]

Since $|C_1| = 11$, the multiplicative order of 2 mod 23 is 11 (see §5). Thus $x^{23} + 1$ 
splits into linear factors over $GF(2^{11})$, and $\alpha$ is a primitive 23rd root of unity in 
$GF(2^{11})$. 

Over $GF(2)$, $x^{23} + 1$ factors into 

\[ x^{23} + 1 = (x + 1)M^{(1)}(x)M^{(5)}(x), \]

see Fig. 7.1, where 

\[ M^{(1)}(x) = x^{11} + x^9 + x^7 + x^6 + x^5 + x + 1, \]
\[ M^{(5)}(x) = x^{11} + x^{10} + x^6 + x^5 + x^4 + x^2 + 1. \]

The BCH code with designed distance $\delta = 5$ (and $b = 1$) has generator 
polynomial 

\[ g(x) = \text{l.c.m.}\{M^{(1)}(x), M^{(3)}(x)\}. \]

But $M^{(3)}(x) = M^{(1)}(x)$. Therefore $g(x) = M^{(1)}(x)$, and the parity check matrix is 

\[ H = (1, \alpha, \alpha^2, \dots, \alpha^{22}), \]

where each entry is replaced by a binary column of length 11. The dimension 
of the code is $23 - \deg g(x) = 12$. 
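
The arithmetic behind this example is easy to confirm by machine. The sketch below assumes the two degree-11 factors of $x^{23} + 1$ are $x^{11} + x^9 + x^7 + x^6 + x^5 + x + 1$ and $x^{11} + x^{10} + x^6 + x^5 + x^4 + x^2 + 1$ (which of them is labelled $M^{(1)}$ depends on the choice of $\alpha$ and does not affect the product). Polynomials are stored as bitmasks; the helper names are illustrative.

```python
def gf2_mul(a: int, b: int) -> int:
    """Multiply two binary polynomials stored as bitmasks (bit i = coefficient of x^i)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return product


def multiplicative_order(q: int, n: int) -> int:
    """Smallest k >= 1 with q^k = 1 (mod n); assumes gcd(q, n) = 1."""
    k, x = 1, q % n
    while x != 1:
        x = (x * q) % n
        k += 1
    return k


M1 = sum(1 << e for e in (11, 9, 7, 6, 5, 1, 0))
M5 = sum(1 << e for e in (11, 10, 6, 5, 4, 2, 0))
product = gf2_mul(gf2_mul(0b11, M1), M5)        # (x + 1) * M1 * M5

print(multiplicative_order(2, 23))              # size of the coset C_1
print(product == (1 << 23) | 1)                 # is the product x^23 + 1 ?
```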

Figure 7.4 shows that the Bose distance of this code is also 5. However, as 
we shall see in Ch. 20, this BCH code is equivalent to the Golay code $\mathscr{G}_{23}$, and 
has minimum distance 7. Thus here also the minimum distance is greater than 
the designed distance, illustrating the fact that the BCH bound is not tight. 

designed            generator polynomial    dimension          actual 
distance $\delta$   $g(x)$                  $= n - \deg g(x)$  distance $d$ 
1                   $1$                     23                 1 
3 or 5              $M^{(1)}$               12                 7 
7, 9, ..., 23       $M^{(1)}M^{(5)}$        1                  23 


Fig. 7.4. BCH codes of length 23. 
We return to BCH codes in Ch. 9. 


Problems. (22) Using a table of cyclotomic cosets for $n = 63$, find the dimensions 
of the BCH codes for $\delta = 3, 5, 7, 9, 11$. Do the same for $n = 51$. 

(23) Use the BCH bound to show that the $[2^m - 1, m]$ simplex code with 
$h(x) = M^{(1)}(x)$ has $d = 2^{m-1}$. 

(24) (Hard) Hartmann and Tzeng's generalization of the BCH bound. 
Suppose that, for some integers $c_1$ and $c_2$ relatively prime to $n$, a cyclic code 
has zeros $\alpha^{b + i_1 c_1 + i_2 c_2}$ for all $i_1 = 0, 1, \dots, \delta - 2$ and $i_2 = 0, 1, \dots, s$. Show that the 
minimum distance is at least $\delta + s$. (Thus if $c_1 = 1$ the code has $s + 1$ strings of 
$\delta - 1$ consecutive zeros. The BCH bound is the special case $s = 0$.) 


Reversible codes. 


Definition. A code $\mathscr{C}$ is reversible if $(c_0, c_1, \dots, c_{n-2}, c_{n-1}) \in \mathscr{C}$ implies 
$(c_{n-1}, c_{n-2}, \dots, c_1, c_0) \in \mathscr{C}$. For example $\{000, 110, 101, 011\}$ is a reversible code. 
So is the [15, 6, 6] binary BCH code of length 15 with 
$g(x) = M^{(-1)}(x)M^{(0)}(x)M^{(1)}(x)$. 


Problems. (25) Show that the BCH code with $b = -t$ and designed distance 
$\delta = 2t + 2$ is reversible. 

(26) Show that a cyclic code is reversible iff the reciprocal of every zero of 
$g(x)$ is also a zero of $g(x)$. 

(27) Show that if $-1$ is a power of $q$ mod $n$ then every cyclic code over 
$GF(q)$ of length $n$ is reversible. 

(28) Melas's double-error-correcting codes. Show, using Problem 24, that if 
$m$ is odd the $[n = 2^m - 1,\ k = n - 2m]$ reversible binary code with 
$g(x) = M^{(-1)}(x)M^{(1)}(x)$ has $d \ge 5$. 

(29) Zetterberg's double-error-correcting codes. Let $n = 2^{2i} + 1$ ($i > 1$) and 
let $\alpha \in GF(2^{4i})$ be a primitive $n$-th root of unity. Again using Problem 24, show 
that the code with $g(x) = M^{(1)}(x)$ is a reversible code with $d \ge 5$. 

(30) (Korzhik) Show that if $h(x)$ is divisible by $x^T - 1$, then the minimum 
distance is at most $n/T$. 


*§7. Using a matrix over $GF(q^m)$ to define a code over $GF(q)$ 


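This section studies more carefully how a matrix over the big field $GF(q^m)$ 
can be used to define a code over the little field $GF(q)$. 

First suppose the code is to be defined by a parity check matrix $H$ over 
$GF(q^m)$. More precisely, let $H = (H_{ij})$, where $H_{ij} \in GF(q^m)$ for $1 \le i \le r$, 
$1 \le j \le n$, be an $r \times n$ matrix of rank $r$ over $GF(q^m)$. Then let $\mathscr{C}_H$ be the code 
over $GF(q)$ consisting of all vectors $a = (a_1, \dots, a_n)$, $a_i \in GF(q)$, such that 
$Ha^T = 0$. 

Another way of getting $\mathscr{C}_H$ is as follows. Pick a basis $\alpha_1, \dots, \alpha_m$ for 
$GF(q^m)$ over $GF(q)$, and write 

\[ H_{ij} = \sum_{l=1}^{m} H_{ijl}\,\alpha_l, \qquad H_{ijl} \in GF(q). \]

Define $\bar{H}$ to be the $rm \times n$ matrix obtained from $H$ by replacing each entry $H_{ij}$ 
by the corresponding column vector $(H_{ij1}, \dots, H_{ijm})^T$ from $GF(q)$. Thus 

\[ \bar{H} = \begin{pmatrix} H_{111} & H_{121} & \cdots & H_{1n1} \\ H_{112} & H_{122} & \cdots & H_{1n2} \\ \vdots & \vdots & & \vdots \\ H_{11m} & H_{12m} & \cdots & H_{1nm} \\ H_{211} & H_{221} & \cdots & H_{2n1} \\ \vdots & \vdots & & \vdots \\ H_{r1m} & H_{r2m} & \cdots & H_{rnm} \end{pmatrix}. \]

Then 

\[ a \in \mathscr{C}_H \ \text{ iff } \ \sum_{j=1}^{n} H_{ij} a_j = 0 \ \text{ for } i = 1, \dots, r \]
\[ \text{iff } \ \sum_{j=1}^{n} H_{ijl} a_j = 0 \ \text{ for } i = 1, \dots, r;\ l = 1, \dots, m \]
\[ \text{iff } \ \bar{H} a^T = 0. \]

Thus either $H$ or $\bar{H}$ can be used to define $\mathscr{C}_H$. The rank of $\bar{H}$ over $GF(q)$ is at 
most $rm$, so $\mathscr{C}_H$ is an $[n, k \ge n - rm]$ code, assuming $rm \le n$. 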

Of course we could also consider the code $\mathscr{C}_H^{\#}$ over $GF(q^m)$ consisting of 
all vectors $b = (b_1, \dots, b_n)$, $b_i \in GF(q^m)$, such that $Hb^T = 0$. Then $\mathscr{C}_H^{\#}$ is an 
$[n, n - r]$ code over $GF(q^m)$. Since $GF(q) \subset GF(q^m)$, every codeword in $\mathscr{C}_H$ is 
in $\mathscr{C}_H^{\#}$. In fact $\mathscr{C}_H$ consists of exactly those codewords of $\mathscr{C}_H^{\#}$ which have 
components from $GF(q)$. We will denote this by writing 

\[ \mathscr{C}_H = \mathscr{C}_H^{\#} \,|\, GF(q), \tag{22} \]

and call $\mathscr{C}_H$ a subfield subcode of $\mathscr{C}_H^{\#}$. 

In general, if $\mathscr{C}^{\#}$ is any $[n, k^{\#}, d^{\#}]$ code over $GF(q^m)$, the subfield subcode 
$\mathscr{C}^{\#}|GF(q)$ consists of the codewords of $\mathscr{C}^{\#}$ which have components from $GF(q)$. 
Then $\mathscr{C}^{\#}|GF(q)$ is an $[n, k, d]$ code with $n - m(n - k^{\#}) \le k \le k^{\#}$ and $d \ge d^{\#}$. 

For example, let $\mathscr{C}^{\#}$ be the [7, 6, 2] BCH code over $GF(2^3)$ with generator 
polynomial $x + \alpha$, where $\alpha \in GF(2^3)$ satisfies $\alpha^3 + \alpha + 1 = 0$. Let $\mathscr{C}$ be the 
subfield subcode $\mathscr{C}^{\#}|GF(2)$. The codeword $a(x) = (x + \alpha)(x + \alpha^2)(x + \alpha^4) = 
x^3 + x + 1$ is in $\mathscr{C}^{\#}$ and hence in $\mathscr{C}$. Thus $\mathscr{C}$ contains the [7, 4, 3] code $\mathscr{H}_3$. In 
fact $\mathscr{C} = \mathscr{H}_3$, since $\mathscr{C}$ has minimum distance at least 2. 

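The trace mapping $T_m$ from $GF(q^m)$ to $GF(q)$ can be used to express the 
dual of $\mathscr{C}^{\#}|GF(q)$ in terms of the dual of $\mathscr{C}^{\#}$. This mapping is defined by 

\[ T_m(x) = x + x^q + x^{q^2} + \cdots + x^{q^{m-1}}, \qquad x \in GF(q^m), \]

see §8 of Ch. 4. Let $T_m(\mathscr{C}^{\#})$ be the code over $GF(q)$ consisting of all distinct 
vectors 

\[ T_m(b) = (T_m(b_1), \dots, T_m(b_n)), \quad \text{for } b \in \mathscr{C}^{\#}. \tag{23} \]

Then $T_m(\mathscr{C}^{\#})$ is an $[n, k, d]$ code over $GF(q)$ with $k^{\#} \le k \le mk^{\#}$ and $d \le d^{\#}$. 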
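
To make the coordinatewise map (23) concrete, here is a small sketch for $q = 2$, $m = 3$. It assumes $GF(2^3)$ is built from $\alpha^3 + \alpha + 1 = 0$, as in the [7, 6, 2] example above; field elements are stored as 3-bit masks in the basis $1, \alpha, \alpha^2$, and the helper names `gf8_mul` and `trace` are illustrative. The vector `b` is just a sample vector over $GF(2^3)$, chosen for illustration.

```python
def gf8_mul(a: int, b: int) -> int:
    """Multiply in GF(2^3) = GF(2)[alpha]/(alpha^3 + alpha + 1); elements are 3-bit masks."""
    product = 0
    for _ in range(3):
        if b & 1:
            product ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:          # reduce using alpha^3 = alpha + 1
            a ^= 0b1011
    return product


def trace(x: int) -> int:
    """T_3(x) = x + x^2 + x^4; the result is always 0 or 1."""
    x2 = gf8_mul(x, x)
    x4 = gf8_mul(x2, x2)
    return x ^ x2 ^ x4


alpha = 0b010
powers = [1]
for _ in range(6):
    powers.append(gf8_mul(powers[-1], alpha))    # powers[i] = alpha^i

# Sample vector b = (alpha, alpha^2, ..., alpha^6, 1) and its trace, as in (23).
b = powers[1:] + powers[:1]
print([trace(p) for p in powers])
print([trace(bi) for bi in b])
```

The map is $GF(2)$-linear, which is why the image of a code under it is again a linear code.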


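Theorem 11. (Delsarte.) The dual of a subfield subcode is the trace of the dual 
of the original code, or 

\[ (\mathscr{C}^{\#} \,|\, GF(q))^{\perp} = T_m((\mathscr{C}^{\#})^{\perp}). \tag{24} \]


Proof. (i) $T_m((\mathscr{C}^{\#})^{\perp}) \subseteq (\mathscr{C}^{\#}|GF(q))^{\perp}$. To prove this, if $a \in$ LHS we must show 
that $a \cdot c = 0$ for all $c \in \mathscr{C}^{\#}|GF(q)$. In fact $a = (T_m(b_1), \dots, T_m(b_n))$ for some 
$b \in (\mathscr{C}^{\#})^{\perp}$. Therefore 

\[ a \cdot c = \sum_{i=1}^{n} T_m(b_i)c_i = T_m\Big(\sum_{i=1}^{n} b_i c_i\Big) = T_m(0) = 0. \]

(ii) $(\mathscr{C}^{\#}|GF(q))^{\perp} \subseteq T_m((\mathscr{C}^{\#})^{\perp})$, or equivalently 

\[ (T_m((\mathscr{C}^{\#})^{\perp}))^{\perp} \subseteq \mathscr{C}^{\#} \,|\, GF(q). \tag{25} \]

To prove (25), if $a \in$ LHS we must show that $a \in \mathscr{C}^{\#}$. By definition $a \cdot c = 0$ 
for all $c = (T_m(b_1), \dots, T_m(b_n))$ where $b \in (\mathscr{C}^{\#})^{\perp}$. If $b \in (\mathscr{C}^{\#})^{\perp}$ so is $\lambda b$ for all 
$\lambda \in GF(q^m)$. Therefore 

\[ \sum_{i=1}^{n} a_i T_m(\lambda b_i) = T_m\Big(\lambda \sum_{i=1}^{n} a_i b_i\Big) = 0 \quad \text{for all } \lambda \in GF(q^m), \]

so 

\[ \sum_{i=1}^{n} a_i b_i = 0 \ \text{ and } \ a \in \mathscr{C}^{\#}. \qquad \text{Q.E.D.} \]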


For example the dual of the above [7, 6, 2] code has generator polynomial 
$(x^7 + 1)/(x + \alpha^6) = x^6 + \alpha^6 x^5 + \alpha^5 x^4 + \alpha^4 x^3 + \alpha^3 x^2 + \alpha^2 x + \alpha$. The trace of this is 
$x^6 + x^5 + x^4 + x^2$, and in this way we obtain the [7, 3, 4] code $\mathscr{H}_3^{\perp}$. 

Problems. (31) Let $C$ be any invertible $r \times r$ matrix over $GF(q^m)$ and set 
$H_1 = CH$. Show that $\mathscr{C}_{H_1} = \mathscr{C}_H$; i.e. $CH$ and $H$ define the same code. 

(32) As a converse to the preceding, let $\mathscr{C}$ be any $[n, k]$ code over $GF(q)$. 
Show that if $rm \ge n - k$ there is an $r \times n$ matrix $H$ over $GF(q^m)$ such that 
$\mathscr{C} = \mathscr{C}_H$. 

(33) (a) Let $\mathscr{C}^{\#}$ be a cyclic code over $GF(q^m)$ with zeros $\alpha^i$ for $i \in K$, where 
$\alpha \in GF(q^m)$. Show that $\mathscr{C}^{\#}|GF(q)$ is a cyclic code with zeros 
$\alpha^i, \alpha^{iq}, \alpha^{iq^2}, \dots, \alpha^{iq^{m-1}}$ for $i \in K$. 

(b) As an illustration of (a), take €" to be the [7, 5] cyclic code over GF(2°) 
with generator polynomial (x + a)(x + a°), where a is a primitive element of 
GF(2°). Show that €"| GF(2) = X. 


§8. Encoding cyclic codes 


In this section two encoding circuits are described which can be used for 
any cyclic code. We illustrate the technique by two examples. 


Examples. (E1) The [15, 11, 3] Hamming code $\mathscr{H}_4$, a cyclic code with generator 
polynomial $g(x) = x^4 + x + 1$. 
(E2) The [15, 7, 5] double-error-correcting BCH code with 
\[ g(x) = (x^4 + x + 1)(x^4 + x^3 + x^2 + x + 1) = x^8 + x^7 + x^6 + x^4 + 1. \]
Suppose the message $u = u_0, u_1, \dots, u_{10}$ is to be encoded by code (E1), and 
the corresponding codeword is 

\[ c = \underbrace{c_0, \dots, c_3}_{\text{check symbols}},\ \underbrace{c_4, \dots, c_{14}}_{\text{message symbols}}. \]

See Fig. 7.5. 


Fig. 7.5. [Block diagram: message source, encoder, channel. The message $u_0 \cdots u_{10}$ enters the encoder, which sends the checks $c_0 \cdots c_3$ and the message symbols $c_4 \cdots c_{14}$ to the channel.] 


Encoder #1. The first encoder requires $\deg g(x)$ delay elements. $c$ is in the 
code iff the polynomial $c(x)$ is divisible by $g(x) = x^4 + x + 1$. So we must 
choose $c_0, \dots, c_3$ to make this happen. One way to do this is to divide 

\[ c'(x) = c_{14}x^{14} + c_{13}x^{13} + \cdots + c_4 x^4 \]

by $x^4 + x + 1$, giving a remainder $r(x) = r_3 x^3 + \cdots + r_0$. Then set $c_i = r_i$ ($i = 
0, \dots, 3$), for $c(x) = c'(x) + r(x)$ is divisible by $x^4 + x + 1$. 

To implement this, a circuit is needed which divides by $x^4 + x + 1$. A simple 
example will show how to construct such a circuit. Suppose we divide 

\[ x^9 + x^8 + x^5 + x^4 \quad \text{by} \quad x^4 + x + 1, \]

using detached coefficients. (I.e. we write 10011 instead of $x^4 + x + 1$, etc.) 


                 110110        = quotient 
    10011 ) 1100110000         = dividend 
            10011 
             10101 
             10011 
              01100 
              00000 
               11000 
               10011 
                10110 
                10011 
                 01010 
                 00000 
                  1010         = remainder 

















The quotient is $x^5 + x^4 + x^2 + x$ and the remainder $r(x)$ is $x^3 + x$. 

The key point to observe is that each time there is a 1 in the quotient, the 
dividend is changed 3 and 4 places back. 
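
The detached-coefficient division above can be reproduced in a few lines. This is a sketch only, not a model of the shift-register circuit: binary polynomials are stored as Python integers (bit $i$ is the coefficient of $x^i$), and the function name `gf2_divmod` is an illustrative choice. Each time a quotient bit is 1, the shifted divisor is XORed into the running dividend, which is exactly the "changed 3 and 4 places back" observation.

```python
def gf2_divmod(dividend: int, divisor: int) -> tuple:
    """Long division of binary polynomials stored as bitmasks; returns (quotient, remainder)."""
    quotient = 0
    divisor_len = divisor.bit_length()
    while dividend.bit_length() >= divisor_len:
        shift = dividend.bit_length() - divisor_len
        quotient |= 1 << shift          # a 1 in the quotient ...
        dividend ^= divisor << shift    # ... cancels the leading term of the dividend
    return quotient, dividend


quotient, remainder = gf2_divmod(0b1100110000, 0b10011)
print(format(quotient, "b"), format(remainder, "b"))
```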

Therefore the circuit shown in Fig. 7.6 performs the same calculation. 


Fig. 7.6. A circuit to divide by $x^4 + x + 1$. [Shift-register diagram: input labelled DIVIDEND (0000110011), output labelled QUOTIENT.] 


The remainder ($0101 = x + x^3$) is what is left in the register when the dividend 
has been completely fed in. 


So our first attempt at encoding is: Feed in the dividend (message symbols 
followed by zeros) 

\[ 0\,0\,0\,0\,c_4 c_5 \cdots c_{14}. \]

The remainder when all 15 have been fed in is 

\[ c_0 c_1 c_2 c_3. \]

A circuit to do this is shown in Fig. 7.7. 


Fig. 7.7. Preliminary version of encoder #1. [Circuit diagram: message source, shift register holding $c_0 c_1 c_2 c_3$, channel.] 


The switches have three positions: at A for 11 clock cycles, during which 
time the message is fed into the channel and into the register; at B for 4 
cycles, while 4 zeros enter the register; and at C for 4 cycles, while the 
remainder enters the channel. 

The disadvantage of this scheme is obvious: the channel is idle while the 
switches are at B. 

To overcome this difficulty, we feed the message into the right-hand end of 
the shift register. This has the effect of premultiplying the symbols by $x^4$ as 
they come in. So instead of the divisor circuit of Fig. 7.6 we use that of Fig. 
7.8. 


Fig. 7.8. Division circuit, with premultiplication by $x^4$. [Circuit diagram: input $c_4 \cdots c_{14}$, output labelled QUOTIENT.] 

The remainder is now available in the register as soon as $c_4$ has been fed in. 

The final encoder is shown in Fig. 7.9. The switches are at A for 11 cycles, 
and at B for 4 cycles. 







Fig. 7.9. Final version of encoder #1. [Circuit diagram: message source, shift register holding $c_0 c_1 c_2 c_3$, channel.] 

It is clear that a similar encoder will work for any cyclic code, and requires 
$\deg g(x)$ delay elements in the shift register. Figure 7.10 shows the encoder 
for Example (E2). 
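
In software the effect of Encoder #1 is simply "shift the message up by $\deg g(x)$ places and append the remainder". The following sketch assumes binary codes and stores polynomials as bitmasks (bit $i$ is the coefficient of $x^i$); the function names `gf2_mod` and `encode` are illustrative and do not model the switch timing of the circuit.

```python
def gf2_mod(a: int, g: int) -> int:
    """Remainder of a(x) divided by g(x), both binary polynomials stored as bitmasks."""
    g_len = g.bit_length()
    while a.bit_length() >= g_len:
        a ^= g << (a.bit_length() - g_len)
    return a


def encode(message: int, g: int, n: int) -> int:
    """Systematic encoding: bit i of `message` becomes the symbol c_{r+i}, where r = deg g."""
    r = g.bit_length() - 1
    assert 0 <= message < 1 << (n - r)
    shifted = message << r                    # c'(x) = x^r u(x)
    return shifted ^ gf2_mod(shifted, g)      # c(x) = c'(x) + r(x)


G_E1 = 0b10011          # x^4 + x + 1, Example (E1)
G_E2 = 0b111010001      # x^8 + x^7 + x^6 + x^4 + 1, Example (E2)

c1 = encode(0b1, G_E1, 15)            # message with only c_4 = 1
c2 = encode(0b1000000, G_E2, 15)      # message with only c_14 = 1
print(format(c1, "015b"), format(c2, "015b"))
```

The message symbols appear unchanged in the high-order positions of the codeword, and the check symbols occupy the low-order $\deg g(x)$ positions.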





Fig. 7.10. Encoder #1 for a BCH code. 


Encoder #2. The second encoder requires $\deg h(x)$ delay elements. We saw 
above that the check symbols are defined by Equations (11). 
For Example (E2), 
\[ h(x) = (x^{15} + 1)/g(x) = (x + 1)(x^2 + x + 1)(x^4 + x^3 + 1) = x^7 + x^6 + x^4 + 1. \]
So the codeword satisfies 
\[ c_0 + c_1 + c_3 + c_7 = 0, \]
\[ c_1 + c_2 + c_4 + c_8 = 0, \]
\[ \cdots \]

If $c_{14}, \dots, c_8$ are the message symbols, this defines the check symbols 
$c_7, c_6, \dots, c_0$. Figure 7.11 shows the encoder to do this. The switch is 
at A for 7 cycles, at B for 8 cycles. The circuit is shown immediately after the 
last message symbol, $c_8$, has been fed in, and the first check symbol 

\[ c_7 = c_8 + c_{10} + c_{14} \]

is being calculated. 


Fig. 7.11. Encoder #2 for a BCH code. [Circuit diagram: message source, shift register holding $c_8 c_9 \cdots c_{14}$, channel.] 
Clearly Encoder #2 will work for any cyclic code, and requires $\deg h(x) = 
n - \deg g(x)$ delay elements. Often one chooses that encoder having the 
smaller number of delay elements. 
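
For Example (E2) the rule used by Encoder #2 is a linear recurrence: since $h(x) = x^7 + x^6 + x^4 + 1$, every codeword satisfies $c_j + c_{j+1} + c_{j+3} + c_{j+7} = 0$ for $j = 0, \dots, 7$, so each check symbol is the sum of three symbols already known. The sketch below is one way to write this out, with symbols held in a Python list; the names `encode_by_recurrence` and `gf2_mod` are illustrative, and the final check against $g(x)$ is included only to confirm the result is a codeword.

```python
def gf2_mod(a: int, g: int) -> int:
    """Remainder of a(x) divided by g(x), both binary polynomials stored as bitmasks."""
    g_len = g.bit_length()
    while a.bit_length() >= g_len:
        a ^= g << (a.bit_length() - g_len)
    return a


def encode_by_recurrence(message: list) -> list:
    """message = [c_8, ..., c_14]; returns the full codeword [c_0, ..., c_14]."""
    assert len(message) == 7
    c = [0] * 8 + [bit & 1 for bit in message]
    for j in range(7, -1, -1):                  # c_7 first, then c_6, ..., c_0
        c[j] = c[j + 1] ^ c[j + 3] ^ c[j + 7]
    return c


codeword = encode_by_recurrence([0, 0, 0, 0, 0, 0, 1])     # only c_14 = 1
as_mask = sum(bit << i for i, bit in enumerate(codeword))
print(codeword, gf2_mod(as_mask, 0b111010001) == 0)
```

Because the message symbols occupy the same positions as in Encoder #1, both encoders produce the same codeword for a given message.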


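Problems. (34) Design Encoder #2 for Example E1. 
(35) (a) Show that Encoder #1 corresponds to the following generator 
matrix 

\[ G_1 = \left( \begin{array}{cccc|cccc} a_{00} & a_{01} & \cdots & a_{0,r-1} & 1 & & & \\ a_{10} & a_{11} & \cdots & a_{1,r-1} & & 1 & & \\ \vdots & \vdots & & \vdots & & & \ddots & \\ a_{k-1,0} & a_{k-1,1} & \cdots & a_{k-1,r-1} & & & & 1 \end{array} \right) \tag{26} \]

in the sense that if the message $u = u_0 \cdots u_{k-1} = c_{n-k} c_{n-k+1} \cdots c_{n-1}$ is input 
to Encoder #1, the codeword $c = uG_1$ is obtained. Here $a_{i0} + a_{i1}x + \cdots + 
a_{i,r-1}x^{r-1}$ is the remainder when $x^{r+i}$ is divided by $g(x)$, for $i = 0, \dots, k - 1$. 
Thus (26) is obtained from (1) by diagonalizing the last $k$ columns. If we write 
(26) as $G_1 = [A_1 \,|\, I]$, the corresponding parity check matrix is $H_1 = [I \,|\, -A_1^T]$. 
(b) Similarly show that Encoder #2 corresponds to the following parity 
check matrix 

\[ H_2 = \left( \begin{array}{cccc|cccc} 1 & & & & b_{r-1,k-1} & b_{r-1,k-2} & \cdots & b_{r-1,0} \\ & 1 & & & b_{r-2,k-1} & b_{r-2,k-2} & \cdots & b_{r-2,0} \\ & & \ddots & & \vdots & \vdots & & \vdots \\ & & & 1 & b_{0,k-1} & b_{0,k-2} & \cdots & b_{00} \end{array} \right) = [I \,|\, B_2] \ \text{(say)}, \tag{27} \]

and to the generator matrix $G_2 = [-B_2^T \,|\, I]$, where $b_{i,k-1}x^{k-1} + b_{i,k-2}x^{k-2} + \cdots + b_{i0}$ is 
the remainder when $x^{k+i}$ is divided by $h(x)$, for $i = 0, \dots, r - 1$. Thus (27) is 
obtained from (9) by diagonalizing the first $r$ columns. 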


Calculating the syndrome. Techniques for decoding cyclic codes will be 
described in §6 of Ch. 9 (BCH codes), §9 of Ch. 12 (alternant codes), §7 of Ch. 
13 (RM codes), and §6 of Ch. 16 (cyclic codes in general, and especially 
quadratic-residue codes). The first decoding step is always to calculate the 
syndrome, which essentially means re-encoding the received vector (see §4 of 
Ch. 1). If the code is being used for error detection (not error correction) this 
is all the receiver has to do. 


Problem. (36) (a) Suppose the received vector $y = y_0 y_1 \cdots y_{n-1}$ is input to 
Encoder #1. Let $s_0 s_1 \cdots s_{r-1}$ be the contents of the shift register when all $n$ 
digits of $y$ have been fed in. Show that $S = (s_0 s_1 \cdots s_{r-1})$ satisfies $S^T = H_1 y^T$ 
and hence is the syndrome of $y$. [Hint: write $y(x) = (y_0 + \cdots + y_{r-1}x^{r-1}) + 
(y_r x^r + \cdots + y_{n-1}x^{n-1}) = y_1(x) + y_2(x)$, and $y_2(x) = q(x)g(x) + f(x)$; then $s(x) = 
\sum s_i x^i = f(x) + y_1(x)$, etc.] 

(b) Show that after one additional clock cycle the shift register contains the 
syndrome corresponding to $xy(x)$, a cyclic shift of $y$. 

(c) If Encoder #2 is used to calculate the syndrome the circuit (e.g. Fig. 
7.11) must be modified slightly. Show that Fig. 7.12 will indeed form the 
syndrome $S^T = H_2 y^T$. The switches are in position A for $k = 7$ clock cycles, 
during which time the (possibly distorted) information symbols are fed into 
the shift register. Then the switches are in position B for $r = 8$ cycles, and the 
output gives $S$. 


Fig. 7.12. Encoder #2 modified to calculate syndrome. [Circuit diagram: input labelled RECEIVED VECTOR, output labelled SYNDROME.] 


Notes on Chapter 7 


§2. Prange [1074-1077] seems to have been the first to study cyclic codes. See 
also Abramson [2]. 


§3. Much more about binary and nonbinary Hamming and related codes can 
be found in Abramson [1], Azumi and Kasami [55], Bose and Burton [180], 
Cocke [296], Farrell and Al-Bandar [419], Golay [512], Hsiao [667], Van Lint 
[848, 855], Lytle [867], Marcovitz [913], Peterson and Weldon [1040], Sankar 
and Krishnamurthy [1143] and Stirzaker and Yuen [1281]. 

Two combinatorial problems which are related to perfect codes and 
especially to nonbinary Hamming codes are the coin-weighing and the foot- 
ball pool problems - for details see Bellman [101], Golay [510], Kamps and 
Van Lint [712,713], Katona [748], Van Lint [855], Stanton [1265], Stanton et 
al. [1267] and Zaremba [1452, 1453]. See also the covering problem mentioned 
in §3 of Appendix A. 

For shortened cyclic codes see Peterson and Weldon [1040, §8.10] and 
Tokura and Kasami [1332]. Kasami [728] has shown that these codes meet the 
Gilbert-Varshamov bound. 


§5. The real factors of $x^n - 1$ are called cyclotomic polynomials - see Berlekamp 
[113], Kurshan [788], Lehmer [811]. See also [1200]. Problem 20 holds 
for circulant matrices over any field. See Muir [974, pp. 444-445]. Lemma 7 is 
a special case of the inversion formula for characters of an Abelian group (cf. 
Mann [907, p. 75]). 


§6. Binary BCH codes were discovered simultaneously by Bose and Ray-Chaudhuri 
[184, 185] and Hocquenghem [658]. The nonbinary codes are due to 
Gorenstein and Zierler [552]. Levinson [826] gives a brief description of BCH 
codes. See also Lum [863]. Problem 24 is from Hartmann and Tzeng [615]; 
see also Hartmann [608, 610]. The references for reversible codes are Hart- 
mann and Tzeng [615], Hong and Bossen [661], Massey [919], Melas [953], 
Tzeng and Hartmann [1350] and Zetterberg [1455]. Problem 30 is from 
Korzhik [778]. 

BCH codes are not always the best cyclic codes. For example Chen [266] 
found [63, 45, 8], [63, 24, 16] and [63, 28, 15] cyclic codes which are better than 
BCH codes (see Fig. 9.1). See also Berlekamp and Justesen [127] and 
Appendix A. 


§7. Theorem 11 is from Delsarte [359]. 


§8. The encoding techniques follow Peterson and Weldon [1040], who give an 
excellent treatment of digital circuitry. 


Cyclic codes (cont.): 
Idempotents and Mattson-Solomon polynomials 


§1. Introduction 


This chapter continues the study of cyclic codes begun in Chapter 7. In §2 
of that chapter we saw that a cyclic code consists of all multiples of its 
generator polynomial $g(x)$. 

Another useful polynomial in a cyclic code is its idempotent $E(x)$, defined 
by the property $E(x)^2 = E(x)$ (§2). (It is sometimes easier to find the idempotent 
than the generator polynomial.) The smallest cyclic codes are the 
minimal ideals (§3). These are important for several reasons: (i) any cyclic 
code is a direct sum of minimal ideals (Theorem 7); (ii) a minimal ideal is 
isomorphic to a field (Theorem 9); (iii) minimal ideals include the important 
family of simplex codes. 

The automorphism group of a code (i.e. the set of all permutations of 
coordinates which fix the code) is discussed in §5. This group is important for 
understanding the structure of a code and also for decoding. 

A useful device for getting the weight distribution of a code is its Mattson-Solomon 
(MS) polynomial (§6). The last section uses the MS polynomial to 
calculate the weight distribution of some cyclic codes. 

Throughout this chapter $\mathscr{C}$ is a cyclic code of length $n$ over $F = GF(q)$, 
where $n$ and $q$ are relatively prime. As in §5 of Ch. 7, $m$ is the multiplicative 
order of $q$ modulo $n$, and $\alpha \in GF(q^m)$ is a primitive $n$-th root of unity. Also $R_n$ 
is the ring $GF(q)[x]/(x^n - 1)$ consisting of all polynomials of degree $\le n - 1$ 
with coefficients from $GF(q)$. 


§2. Idempotents 


In Sections 2-4 we restrict ourselves to binary codes of length n, where n 
is odd. 


Definition. A polynomial $E(x)$ of $R_n$ is an idempotent if 
\[ E(x) = E(x)^2 = E(x^2). \]

For example $x + x^2 + x^4$ is an idempotent in $R_7$ since $(x + x^2 + x^4)^2 = 
x + x^2 + x^4$. So are $1$ and $x^3 + x^5 + x^6$. In general 

\[ E(x) = \sum_{i=0}^{n-1} \varepsilon_i x^i \]

is an idempotent iff $\varepsilon_i = \varepsilon_{2i}$ (subscripts mod $n$). Thus the exponents of the 
nonzero terms are a union of cyclotomic cosets. 
Plainly if $E(x)$ is an idempotent so is $1 + E(x)$. 


Theorem 1. (i) A cyclic code or ideal $\mathscr{C} = \langle g(x)\rangle$ contains a unique idempotent 
$E(x)$ such that $\mathscr{C} = \langle E(x)\rangle$. Also $E(x) = p(x)g(x)$ for some polynomial $p(x)$, 
and 

\[ E(\alpha^i) = 0 \ \text{ iff } \ g(\alpha^i) = 0. \]

(ii) $c(x) \in \mathscr{C}$ if and only if $c(x)E(x) = c(x)$. 
($\mathscr{C}$ may contain several idempotents, but only one of them generates $\mathscr{C}$.) 


Proof. Let $x^n + 1 = g(x)h(x)$, where $g(x)$, $h(x)$ are relatively prime. An easy 
consequence of the Euclidean algorithm (Corollary 15 of Ch. 12) is that there 
exist polynomials $p(x)$, $q(x)$ such that 

\[ p(x)g(x) + q(x)h(x) = 1 \quad \text{in } F[x]. \tag{1} \]

Set $E(x) = p(x)g(x)$. Then from (1) 

\[ p(x)g(x)[p(x)g(x) + q(x)h(x)] = p(x)g(x), \]
\[ E(x)^2 + 0 = E(x) \quad \text{in } R_n, \]

so $E(x)$ is an idempotent. An $n$-th root of unity is a zero of either $g$ or $h$ but 
not both. From (1), $p$ and $h$ are relatively prime. So if there is an $n$-th root of 
unity which is a zero of $p$, it must also be a zero of $g$. Since $p(x)$ doesn't 
introduce any new zeros, by Lemma 5 of Ch. 7 $E(x)$ and $g(x)$ generate the 
same code. To prove (ii), if $c(x) = c(x)E(x)$ then clearly $c(x) \in \mathscr{C}$. Conversely, 
if $c(x) \in \mathscr{C}$ then $c(x) = b(x)E(x)$, and $c(x)E(x) = b(x)E(x)^2 = b(x)E(x) = 
c(x)$. Finally to show that $E(x)$ is the unique idempotent which generates $\mathscr{C}$, 
suppose $F(x)$ is another such. Then from (ii), $F(x)E(x) = E(x) = F(x)$. 
Q.E.D. 


For example, the [7, 4, 3] Hamming code $\mathscr{H}_3$ has $g(x) = x^3 + x + 1$, $h(x) = 
(x + 1)(x^3 + x^2 + 1) = x^4 + x^2 + x + 1$, and 

\[ xg(x) + h(x) = 1. \]

Thus the idempotent of this code is $E(x) = xg(x) = x^4 + x^2 + x$. 

Problem. (1) Convince yourself that $E(x)$ does generate this code. 
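
One way to approach Problem 1 is by direct enumeration in $R_7$. The sketch below assumes binary polynomials stored as bitmasks (bit $i$ is the coefficient of $x^i$) and reduces modulo $x^7 + 1$ by folding high-order bits back; the helper names `gf2_mul` and `reduce_cyclic` are illustrative. It forms $E(x) = xg(x)$, checks $E(x)^2 = E(x)$, and compares the sets of multiples of $g(x)$ and of $E(x)$.

```python
def gf2_mul(a: int, b: int) -> int:
    """Multiply two binary polynomials stored as bitmasks."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return product


def reduce_cyclic(a: int, n: int) -> int:
    """Reduce a(x) modulo x^n + 1 by using x^n = 1."""
    result = 0
    while a:
        result ^= a & ((1 << n) - 1)
        a >>= n
    return result


n = 7
g = 0b1011                                         # g(x) = x^3 + x + 1
E = reduce_cyclic(gf2_mul(0b10, g), n)             # E(x) = x g(x)
E_squared = reduce_cyclic(gf2_mul(E, E), n)

code_from_g = {reduce_cyclic(gf2_mul(u, g), n) for u in range(1 << n)}
code_from_E = {reduce_cyclic(gf2_mul(u, E), n) for u in range(1 << n)}

print(format(E, "b"), E_squared == E, code_from_g == code_from_E, len(code_from_E))
```

The same enumeration also illustrates part (ii) of Theorem 1: multiplying any codeword by $E(x)$ in $R_7$ returns that codeword.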


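Lemma 2. $E(x)$ is an idempotent iff 
\[ E(\alpha^i) = 0 \text{ or } 1 \quad \text{for } i = 0, 1, \dots, n - 1. \]


Proof. Suppose $E(x)$ is an idempotent. Then $E(\alpha^i)^2 = E(\alpha^i)$, so $E(\alpha^i)$ is equal to 0 or 
1 by Theorem 8 of Ch. 4. For the converse, suppose 

\[ E(x) = \sum_{i=0}^{n-1} \varepsilon_i x^i. \]

Since $E(\alpha^j)$ is 0 or 1, $E(\alpha^{2j}) = E(\alpha^j)^2 = E(\alpha^j)$. By the inversion formula of 
Lemma 7 of Ch. 7, 

\[ \varepsilon_i = \sum_{j=0}^{n-1} E(\alpha^j)\alpha^{-ij} = \sum_s \sum_{j \in C_s} \alpha^{-ij}, \]

where $s$ runs through a subset of the cyclotomic cosets. Thus $\varepsilon_i = \varepsilon_{2i}$, and 
$E(x)$ is an idempotent. Q.E.D. 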


Corollary 3. Dimension of $\mathscr{C}$ 
= number of $\alpha^i$ for which $g(\alpha^i) \neq 0$, 
= number of $\alpha^i$ for which $E(\alpha^i) = 1$. 


Problem. (2) Show that 
\[ g(x) = (E(x), x^n + 1), \]

where $(a, b)$ denotes the greatest common divisor of $a$ and $b$. 

If $a(x) = a_0 + a_1 x + \cdots + a_{n-1}x^{n-1}$ we define 
\[ a^*(x) = a_0 + a_{n-1}x + \cdots + a_1 x^{n-1} \]
(so that the constant term is unchanged while the other coefficients are 
reversed). Plainly $a(\alpha^{-1}) = a^*(\alpha)$. 

Lemma 4. If $E(x)$ is an idempotent so is $E^*(x)$. 


Proof. If 
\[ E(x) = \sum_s \sum_{j \in C_s} x^j \]
then 
\[ E^*(x) = \sum_s \sum_{j \in C_s} x^{-j}. \qquad \text{Q.E.D.} \]


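Theorem 5. Let $\mathscr{C}$ be a code with idempotent $E(x)$. Then $\mathscr{C}^{\perp}$ has idempotent 
$(1 + E(x))^*$. 


Proof. Let $\alpha_1, \dots, \alpha_n$ be the $n$-th roots of unity; suppose that $\alpha_1, \dots, \alpha_t$ are the 
zeros of $\mathscr{C}$; i.e., $E(\alpha_i) = 0$ for $1 \le i \le t$, $E(\alpha_i) = 1$ for $t + 1 \le i \le n$. Then 
$1 + E(x)$ has zeros $\alpha_{t+1}, \dots, \alpha_n$, and $(1 + E(x))^*$ has zeros $\alpha_{t+1}^{-1}, \dots, \alpha_n^{-1}$. These 
are the zeros of $\mathscr{C}^{\perp}$ by Theorem 4 of Ch. 7. Q.E.D. 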


Problems. (3) Show that 1+ E(x) is the idempotent of the code with the 
generator polynomial h(x). 

(4) Find the generator polynomial and idempotent for each cyclic code of 
length 15, and identify the duals. 


§3. Minimal ideals, irreducible codes, and primitive idempotents 


A minimal ideal is one which does not contain any smaller nonzero ideal. 
The corresponding cyclic code is called a minimal or irreducible code, and the 
idempotent of the ideal is called a primitive idempotent. We shall see that 
every idempotent is a sum of primitive idempotents, and that any vector in $R_n$ 
can be written uniquely as a sum of vectors from minimal ideals. 

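The nonzeros of a minimal ideal must be $\{\alpha^i : i \in C_s\}$ for some cyclotomic 
coset $C_s$. We denote this minimal ideal by $\mathscr{M}_s$, and the corresponding primitive 
idempotent by $\theta_s$, and often write $\mathscr{M}_s = \langle\theta_s\rangle$. Thus 

\[ \theta_s(\alpha^j) = \begin{cases} 1 & \text{if } j \in C_s, \\ 0 & \text{otherwise.} \end{cases} \tag{2} \]

In particular, $\theta_0(x)$ has the single nonzero $x = 1$ and is given by 

\[ \theta_0(x) = (x^n + 1)/(x + 1) = \sum_{i=0}^{n-1} x^i. \]

We give an explicit construction for $\theta_s(x)$, using the inversion formula of 
Lemma 7 of Ch. 7. 

Theorem 6. 

\[ \theta_s(x) = \sum_{i=0}^{n-1} \varepsilon_i x^i, \]

where 

\[ \varepsilon_i = \sum_{j \in C_s} \alpha^{-ij} \quad \text{for } i \ge 0. \]


Proof. From Lemma 7 of Ch. 7, 
\[ \varepsilon_i = \sum_{j=0}^{n-1} \theta_s(\alpha^j)\alpha^{-ij} = \sum_{j \in C_s} \alpha^{-ij}. \qquad \text{Q.E.D.} \]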


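For example, if $n = 7$ the coefficients of $\theta_1$, $\theta_3$ are 

\[ \theta_1 : \ \varepsilon_i = \alpha^{-i} + \alpha^{-2i} + \alpha^{-4i}, \]
\[ \theta_3 : \ \varepsilon_i = \alpha^{-3i} + \alpha^{-6i} + \alpha^{-5i}. \]

Since $7 = 2^3 - 1$, $\alpha$ is in this case a primitive element of $GF(2^3)$. 


Problems. (5) Use the table of $GF(2^3)$ in Fig. 4.5 to show that $\theta_0 = 
1 + x + \cdots + x^6$, $\theta_1 = 1 + x + x^2 + x^4$, $\theta_3 = 1 + x^3 + x^5 + x^6$. 

(6) If instead $GF(2^3)$ is defined by $\alpha^3 + \alpha^2 + 1 = 0$ show that $\theta_1$ and $\theta_3$ are 
interchanged. 


Thus using a different polynomial to define the field has the effect of 
relabelling the $\theta_s$'s. 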
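
The formula of Theorem 6 is easy to evaluate mechanically when $n = 2^m - 1$. The sketch below is restricted to that primitive case and to $q = 2$; it assumes the field is specified by a primitive polynomial given as a bitmask (for instance `0b1011` for $\alpha^3 + \alpha + 1 = 0$), and the function names `field_powers` and `primitive_idempotent` are illustrative. It returns the exponents $i$ with $\varepsilon_i = 1$.

```python
def field_powers(m: int, poly: int) -> list:
    """powers[i] = alpha^i as an m-bit mask, in GF(2^m) defined by the primitive polynomial poly."""
    powers, x = [], 1
    for _ in range((1 << m) - 1):
        powers.append(x)
        x <<= 1
        if x >> m:
            x ^= poly
    return powers


def primitive_idempotent(s: int, m: int, poly: int) -> list:
    """Exponents of theta_s(x) for n = 2^m - 1, from eps_i = sum over j in C_s of alpha^(-ij)."""
    n = (1 << m) - 1
    powers = field_powers(m, poly)
    coset, j = [], s % n
    while j not in coset:
        coset.append(j)
        j = (2 * j) % n
    exponents = []
    for i in range(n):
        eps = 0
        for j in coset:
            eps ^= powers[(-i * j) % n]    # addition in GF(2^m) is XOR
        assert eps in (0, 1)               # the coefficient lies in GF(2)
        if eps:
            exponents.append(i)
    return exponents


print(primitive_idempotent(1, 3, 0b1011), primitive_idempotent(3, 3, 0b1011))
print(primitive_idempotent(1, 3, 0b1101), primitive_idempotent(3, 3, 0b1101))
```

Running it with the two defining polynomials of $GF(2^3)$ shows the relabelling described in Problem 6.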


Theorem 7. The primitive idempotents satisfy: 

(i) $\sum_s \theta_s = 1$. 

(ii) $\theta_i\theta_j = 0$ if $i \neq j$. 

(iii) $R_n$ is the direct sum of the minimal ideals generated by the $\theta_s$. Thus any 
vector $a(x) \in R_n$ can be written uniquely in the form 

\[ a(x) = \sum_s a_s(x), \]

where $a_s(x)$ is in the ideal generated by $\theta_s$. 
(iv) If $E(x)$ is any idempotent, then for some $a_s \in GF(2)$, $E(x)$ can be 
written as 

\[ \sum_s a_s\theta_s. \]

Conversely, any such expression is an idempotent. 

Proof. (i) By Theorem 6, 

\[ \sum_s \theta_s(x) = \sum_{i=0}^{n-1} x^i \sum_s \sum_{j \in C_s} \alpha^{-ij} = \sum_{i=0}^{n-1} x^i \sum_{j=0}^{n-1} \alpha^{-ij} = 1 \quad \text{by Lemma 6 of Ch. 7.} \]

(ii) Let $\mathscr{M}_i$, $\mathscr{M}_j$ be the minimal ideals with idempotents $\theta_i$, $\theta_j$. $\mathscr{M}_i \cap \mathscr{M}_j$ is a 
proper sub-ideal of $\mathscr{M}_i$, hence 0. $\theta_i\theta_j \in \mathscr{M}_i \cap \mathscr{M}_j$, thus $\theta_i\theta_j = 0$. 
(iii) From (i), 

\[ a(x) \cdot 1 = a(x) \sum_s \theta_s = \sum_s a_s(x), \]

where $a_s(x)$ is in the ideal generated by $\theta_s$. 

(iv) The nonzeros of $E(x)$ are a union of sets of nonzeros of minimal 
idempotents. The result follows from Lemma 2 and (2). The converse then 
follows from (ii). Q.E.D. 


The polynomial $\theta_s^*(x)$ is also a primitive idempotent, and so there is a 
unique smallest $s'$ such that 

\[ \theta_s^*(x) = \theta_{s'}(x). \]

Thus $s' \in C_{-s}$. The nonzeros of $\theta_s^*(x)$ are $\{\alpha^i : i \in C_{-s}\}$. 

If $n = 2^m - 1$, $\theta_1(x)$ generates the $[2^m - 1, m, 2^{m-1}]$ simplex code $\mathscr{S}_m = \mathscr{H}_m^{\perp}$, 
and has weight $2^{m-1}$. The nonzero codewords are the $2^m - 1$ cyclic shifts of 
$\theta_1(x)$. If $n = 2^m - 1$ and $s$ is prime to $n$, $\alpha^s$ is also a primitive element and so 
$\theta_s(x)$ generates a code equivalent to $\mathscr{S}_m$. Stated another way, if $h(x)$ is any 
primitive polynomial, the code with check polynomial $h(x)$ is equivalent to 
$\mathscr{S}_m$. 


Problems. (7) Using the table of $GF(2^4)$ given in Fig. 3.1 of §3.2, show that the 
primitive idempotents for $n = 15$ are: 

\[ \theta_0 = \theta_0^* = x^{14} + x^{13} + \cdots + x + 1, \]
\[ \theta_1 = \theta_7^* = x^{12} + x^9 + x^8 + x^6 + x^4 + x^3 + x^2 + x, \]
\[ \theta_7 = \theta_1^* = x^{14} + x^{13} + x^{12} + x^{11} + x^9 + x^7 + x^6 + x^3, \]
\[ \theta_3 = \theta_3^* = (x^{14} + x^{13} + x^{12} + x^{11}) + (x^9 + x^8 + x^7 + x^6) + (x^4 + x^3 + x^2 + x), \]
\[ \theta_5 = \theta_5^* = (x^{14} + x^{13}) + (x^{11} + x^{10}) + (x^8 + x^7) + (x^5 + x^4) + (x^2 + x). \]

(8) Similarly show that the primitive idempotents for $n = 31$ are (only the 
exponents are given): 
$\theta_0 = \theta_0^* = (0, 1, 2, \dots, 30)$, 
$\theta_1 = \theta_{15}^* = (0, 5, 7, 9, 10, 11, 13, 14, 18, 19, 20, 21, 22, 25, 26, 28)$, 
$\theta_3 = \theta_7^* = (0, 3, 6, 7, 12, 14, 15, 17, 19, 23, 24, 25, 27, 28, 29, 30)$, 
$\theta_5 = \theta_{11}^* = (0, 1, 2, 4, 5, 8, 9, 10, 15, 16, 18, 20, 23, 27, 29, 30)$, 
$\theta_7 = \theta_3^* = (0, 1, 2, 3, 4, 6, 7, 8, 12, 14, 16, 17, 19, 24, 25, 28)$, 
$\theta_{11} = \theta_5^* = (0, 1, 2, 4, 8, 11, 13, 15, 16, 21, 22, 23, 26, 27, 29, 30)$, 
$\theta_{15} = \theta_1^* = (0, 3, 5, 6, 9, 10, 11, 12, 13, 17, 18, 20, 21, 22, 24, 26)$. 

The primitive idempotents for $n = 63$ are shown in Fig. 8.1. Here the
idempotents are written in octal with lowest degree terms on the left, e.g.,
$\theta_1 = x + x^2 + x^4 + x^8 + x^{13} + \cdots$.

    θ_1  = 321026251170156307277
    θ_3  = 012231301223130122313
    θ_5  = 044160277124317353233
    θ_7  = 044044044044044044044
    θ_9  = 723516472351647235164
    θ_11 = 010305172162267315277
    θ_13 = 375263355116136243020
    θ_15 = 323112032311203231120
    θ_21 = 333333333333333333333
    θ_23 = 331327363052375016044
    θ_27 = 456271345627134562713
    θ_31 = 375343166036225150213

Fig. 8.1. Primitive idempotents for length $n = 63$. (Here $\alpha$ is a root of $1 + x + x^2 + x^5 + x^6 = 0$.)


(9) Find the primitive idempotents in $R_n$ for $n = 5, 7, 9, 11$.

(10) If $\mathcal{C}_1$ has idempotent $E_1$ and $\mathcal{C}_2$ has idempotent $E_2$, then (a) $\mathcal{C}_1 \cap \mathcal{C}_2$
has idempotent $E_1E_2$,
(b) $\mathcal{C}_1 \cup \mathcal{C}_2$ (the smallest ideal containing both $\mathcal{C}_1$ and $\mathcal{C}_2$) has idempotent
$E_1 + E_2 + E_1E_2$.

(11) If the idempotent of $\mathcal{C}$ is

\[ \theta_{s_1} + \theta_{s_2} + \cdots + \theta_{s_r}, \]

the nonzeros of the code are $\alpha^v$ for $v \in C_{s_1} \cup \cdots \cup C_{s_r}$. If the idempotent is

\[ 1 + \theta_{s_1} + \theta_{s_2} + \cdots + \theta_{s_r}, \]

the nonzeros of $\mathcal{C}$ are $\alpha^v$ for $v \notin C_{s_1} \cup \cdots \cup C_{s_r}$.

(12) Show that the Hamming code $\mathcal{H}_m$ has idempotent $1 + \theta_1(x)$, and the
$[2^m - 1, 2^m - 1 - 2m, 5]$ double-error-correcting BCH code has idempotent
$1 + \theta_1(x) + \theta_3(x)$.

(13) Study the [9, 6, 2] irreducible code with idempotent $\theta_1(x) = x^3 + x^6$.
Find all the codewords and show that the weight distribution is $A_0 = 1$, $A_2 = 9$,
$A_4 = A_6 = 27$.

(14) Call two codes $\mathcal{C}_1$, $\mathcal{C}_2$ of length $n$ disjoint if $\mathcal{C}_1 \cap \mathcal{C}_2 = 0$. Let $\mathcal{C}_1$, $\mathcal{C}_2$ be
cyclic codes with idempotents $E_1$, $E_2$ and generator polynomials $g_1$, $g_2$. (i)
Show that $\mathcal{C}_1$, $\mathcal{C}_2$ are disjoint iff, when $E_1$, $E_2$ are written as sums of primitive
idempotents $\theta_s$ (cf. Theorem 7), no $\theta_s$ occurs in both. (ii) Show $\mathcal{C}_1$, $\mathcal{C}_2$ are
disjoint if and only if $x^n + 1 \mid g_1g_2$. (iii) Show that dist$(\mathcal{C}_1, \mathcal{C}_2)$ =
min {dist$(u, v)$ : $u \in \mathcal{C}_1$, $v \in \mathcal{C}_2$, not both zero} = min. dist. of the code with
generator polynomial g.c.d. $\{g_1, g_2\}$ and idempotent $E_1 + E_2 + E_1E_2$.

(15) (Lempel) Let

\[ \eta_s(x) = \sum_{i \in C_s} x^i. \]

Show that in $GF(2)[x]$ (i.e. not mod $x^n + 1$)

\[ \eta_s(x)(\eta_s(x) + 1) = \eta_s'(x)(1 + x^n), \]

where $\eta_s'(x)$ is the sum of the odd powers of $x$ occurring in $\eta_s(x)$.

(16) Show that there are $\varphi(n)/m$ minimal ideals equivalent to $\mathcal{S}_m$ (and the
same number of cyclic codes equivalent to $\mathcal{H}_m$), where $\varphi(n)$ is the Euler
function of Problem 8, Ch. 4.

(17) (Hard) Let $h(x)$ be a polynomial which divides $x^n + 1$. Let $E(x)$ be the
idempotent of the ideal with generator polynomial $g(x) = (x^n + 1)/h(x)$. Show
that

\[ E(x) = (x^n + 1)\left( \frac{x h'(x)}{h(x)} + \delta \right), \]

where $h'(x)$ is the derivative of $h(x)$, and $\delta = 0$ if the degree of $h(x)$ is even,
and 1 if the degree of $h(x)$ is odd. Hence show that

\[ E(x) = x^{\deg h(x) - 1}\, g(x)\, r'\!\left(\frac{1}{x}\right), \]

where $r(x) = x^{\deg h(x)} h(1/x)$. [Hint: Show $\deg E(x) \le n - 1$. Then show $E(\alpha^i) = 1$
if $h(\alpha^i) = 0$ and $E(\alpha^i) = 0$ if $h(\alpha^i) \neq 0$.]


Degenerate cyclic codes. A cyclic code which consists of several repetitions of
a code of smaller block length is said to be degenerate. For instance $\langle\theta_0\rangle$ =
{0, 1} is degenerate since it consists of several repetitions of the code {0, 1}.

Problem. (18) For $n = 15$, verify that $\theta_3$ and $\theta_5$ are idempotents of degenerate
ideals. Find the dimensions of these ideals.


Lemma 8. $\langle g(x) \rangle$ is degenerate if and only if the check polynomial $h(x)$ divides
$x^r + 1$ for some $r < n$.


Proof. If $\langle g(x) \rangle$ is degenerate, every codeword, including $g(x)$, is of the form
$s(x)(1 + x^r + x^{2r} + \cdots + x^{n-r})$ for some $r$. Thus $r$ divides $n$ (Problem 11 of Ch.
4) and

\[ g(x) = s(x)\,\frac{x^n + 1}{x^r + 1}, \qquad h(x) = \frac{x^r + 1}{s(x)}. \]

Conversely, let $r < n$ be the smallest integer such that $h(x)$ divides $x^r + 1$.
Then $r$ divides $n$ (for if not, $h(x)$ divides $x^{r'} + 1$ where $r' = (r, n)$). Thus

\[ g(x) = \frac{x^n + 1}{h(x)} = \frac{x^r + 1}{h(x)} \cdot \frac{x^n + 1}{x^r + 1} = s(x)(1 + x^r + \cdots + x^{n-r}). \]

Every codeword is of the form

\[ a(x)s(x)(1 + x^r + \cdots + x^{n-r}), \]

where by Theorem 1 of Ch. 7, $\deg a(x)s(x) < r$. Q.E.D.
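Lemma 8 translates into a direct test. The following is a minimal sketch, not taken from the text: binary polynomials are stored as Python integers (bit $i$ is the coefficient of $x^i$), and the helper names are hypothetical.

```python
def poly_mod(a, b):
    """Remainder of a(x) divided by b(x) over GF(2); polynomials are ints, bit i = coefficient of x^i."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a

def is_degenerate(h, n):
    """Lemma 8: the code with check polynomial h(x) is degenerate iff h(x) | x^r + 1 for some r < n."""
    return any(poly_mod((1 << r) | 1, h) == 0 for r in range(1, n))

# Length 15: x^4 + x^3 + x^2 + x + 1 divides x^5 + 1, while x^4 + x + 1 has no such r < 15.
print(is_degenerate(0b11111, 15), is_degenerate(0b10011, 15))
```

The check polynomial $x + 1$ also passes the test with $r = 1$, in agreement with the remark that $\langle\theta_0\rangle$ is degenerate.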


Problem. (19) Show that for $n = 15$, if $s = 1$, 5 or 7, $\langle\theta_s\rangle$ consists of 0 and all
cyclic shifts of $\theta_s$. On the other hand $\langle\theta_3\rangle$ consists of the vectors $|u|u|u|$ where
$u$ is in the [5, 4, 2] even weight code. Algebraically $\langle\theta_3\rangle$ consists of all elements
$a(x)(x + x^2 + x^3 + x^4)(1 + x^5 + x^{10})$, where $\deg a(x) \le 3$.


Remark. We now have a method of telling whether an irreducible polynomial
$h(x)$ of degree $m$ is primitive. (See §2 of Ch. 3, §4 of Ch. 4.) We form the
idempotent $E(x) = (x^n + 1)(x h'(x)/h(x) + \delta)$ where $n = 2^m - 1$ (Problem 17).
If this idempotent has $2^m - 1$ distinct cyclic permutations the polynomial is
primitive; if not, the code generated by $E(x)$ is degenerate.
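The test in this Remark can be sketched in a few lines. The code below is an illustration under stated assumptions: $h(x)$ is taken to be irreducible of degree $m$ (this is not checked), binary polynomials are stored as integers with bit $i$ the coefficient of $x^i$, and all function names are hypothetical.

```python
def poly_mul(a, b):
    """Product of two binary polynomials (ints, bit i = coefficient of x^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def poly_divmod(a, b):
    """Quotient and remainder of a(x) by b(x) over GF(2)."""
    q, db = 0, b.bit_length()
    while a.bit_length() >= db:
        shift = a.bit_length() - db
        q ^= 1 << shift
        a ^= b << shift
    return q, a

def derivative(h):
    """Formal derivative over GF(2): only odd-degree terms survive."""
    d, i = 0, 1
    while h >> i:
        if (h >> i) & 1:
            d |= 1 << (i - 1)
        i += 2
    return d

def idempotent_from_check(h):
    """E(x) = (x^n + 1)(x h'(x)/h(x) + delta) with n = 2^m - 1, reduced mod x^n + 1."""
    m = h.bit_length() - 1
    n = (1 << m) - 1
    g, rem = poly_divmod((1 << n) | 1, h)
    assert rem == 0, "h(x) must divide x^n + 1"
    e = poly_mul(g, derivative(h) << 1)      # g(x) * x * h'(x)
    if m % 2:                                # delta = 1 when deg h is odd
        e ^= (1 << n) | 1
    while e >> n:                            # reduce modulo x^n + 1
        e = (e & ((1 << n) - 1)) ^ (e >> n)
    return e, n

def is_primitive(h):
    """h (assumed irreducible) is primitive iff E(x) has n distinct cyclic shifts."""
    e, n = idempotent_from_check(h)
    shifts, word = set(), e
    for _ in range(n):
        shifts.add(word)
        word = ((word << 1) & ((1 << n) - 1)) | (word >> (n - 1))
    return len(shifts) == n

print(is_primitive(0b10011), is_primitive(0b11111))   # x^4 + x + 1, x^4 + x^3 + x^2 + x + 1
```

For $h(x) = x^3 + x + 1$ the function `idempotent_from_check` returns $1 + x + x^2 + x^4$, the idempotent of the [7, 3, 4] code used in the examples of this section.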


Problems. (20) Do this for the polynomial h(x) = x*- x? - xxl. 

(21) Show that:

(a) If $(s, n) = 1$ then $|C_s| = m$, $\deg M^{(s)}(x) = m$, and the ideal generated by $\theta_s$
has dimension $m$ and is nondegenerate.

(b) If $(s, n) > 1$ the ideal is degenerate, and its dimension divides $m$ and
may equal $m$.

(c) If $s$ is relatively prime to $n = 2^m - 1$, the ideal consists of the codewords
0, $x^i\theta_s$ for $i = 0, 1, \ldots, 2^m - 2$.

(22) Let $\mathcal{C}$ be a linear code with no coordinate always zero, in which there
is only one nonzero weight. Show that there is a minimal ideal $\mathcal{M}_s$ such that $\mathcal{C}$
is equivalent to the code $\{|u|u| \cdots |u| : u \in \mathcal{M}_s\}$.


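A minimal ideal is isomorphic to a field.


Theorem 9. The minimal ideal $\mathcal{M}_s = \langle\theta_s\rangle$ of dimension $m_s$ is isomorphic to the
field $GF(2^{m_s})$.


Proof. This will follow if we show that if $a_1(x)$, $a_2(x)$ are in $\mathcal{M}_s$ and
$a_1(x)a_2(x) = 0$, then either $a_1(x)$ or $a_2(x)$ is zero.
Suppose $a_1(x) \neq 0$. Let

\[ \mathcal{N} = \{b(x) \in \mathcal{M}_s : b(x)a_1(x) = 0\}. \]

$\mathcal{N}$ is an ideal. Since $\mathcal{M}_s$ is minimal and $\mathcal{N} \subseteq \mathcal{M}_s$, either $\mathcal{N} = \mathcal{M}_s$ or $\mathcal{N} = 0$. By
Theorem 1, $\theta_s a_1(x) = a_1(x)$, so $\mathcal{N} \neq \mathcal{M}_s$. Therefore $\mathcal{N} = 0$ and $a_2(x) = 0$. Q.E.D.


For example, the ideal $\langle\theta_0\rangle$ is a highly redundant picture of the ground field
$GF(2)$.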


Lemma 10. An isomorphism $\varphi$ between $\mathcal{M}_s$ and $GF(2^{m_s})$ is given by:

\[ \text{if } a(x) \in \mathcal{M}_s, \quad a(x)^{\varphi} = a(\beta), \]

where $\beta \in GF(2^{m_s})$ is a primitive $n$-th root of unity which is a nonzero of $\mathcal{M}_s$. (Of
course different choices for $\beta$ give different mappings.)

For example, consider the [7, 3, 4] code with idempotent $\theta_1(x) =
1 + x + x^2 + x^4$. Using Fig. 4.5 we may take $\beta = \alpha$, $\alpha^2$ or $\alpha^4$, giving the
mappings from the code to $GF(2^3)$ shown in Fig. 8.2.


    Codeword a(x)      Elements of GF(2^3)
    0123456            a(α)    a(α^2)   a(α^4)

    0000000            0       0        0
    1110100            1       1        1
    0111010            α       α^2      α^4
    0011101            α^2     α^4      α
    1001110            α^3     α^6      α^5
    0100111            α^4     α        α^2
    1010011            α^5     α^3      α^6
    1101001            α^6     α^5      α^3


Fig. 8.2. Three mappings from $\mathcal{M}_1$ to $GF(2^3)$.
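The entries of Fig. 8.2 can be recomputed by evaluating each codeword at $\alpha$, $\alpha^2$ and $\alpha^4$. The sketch below assumes that Fig. 4.5 defines $GF(2^3)$ by $\alpha^3 = \alpha + 1$; field elements are stored as 3-bit integers, and the names `ALPHA`, `evaluate` and `show` are hypothetical.

```python
ALPHA = [1, 2, 4, 3, 6, 7, 5]            # alpha^0 .. alpha^6 as bit masks, assuming alpha^3 = alpha + 1
LOG = {v: i for i, v in enumerate(ALPHA)}

def evaluate(bits, e):
    """a(alpha^e) for the codeword a(x) given as a coefficient string, lowest degree first."""
    value = 0
    for i, b in enumerate(bits):
        if b == "1":
            value ^= ALPHA[(i * e) % 7]
    return value

def show(v):
    """Print a field element as 0 or as a power of alpha."""
    return "0" if v == 0 else f"a^{LOG[v]}"

theta = "1110100"                                        # idempotent 1 + x + x^2 + x^4
codewords = ["0000000"] + [theta[-i:] + theta[:-i] for i in range(7)]
table = {c: tuple(show(evaluate(c, e)) for e in (1, 2, 4)) for c in codewords}

for c, images in table.items():
    print(c, *images)
```

Each of the three columns hits every element of $GF(2^3)$ exactly once, as an isomorphism must.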




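Proof. Suppose for simplicity that $\beta = \alpha^s$. Let $\xi$ be a primitive element of
$GF(2^{m_s})$, and consider the mapping $\psi$ from $GF(2^{m_s})$ to $\mathcal{M}_s$ defined by

\[ (\xi^i)^{\psi} = \sum_{j=0}^{n-1} T_{m_s}(\xi^i \beta^{-j})\, x^j. \tag{3} \]

We shall show that $\psi$ is the inverse mapping to $\varphi$. Let the RHS of (3) be
denoted by $a^{(i)}(x)$. We first show that $a^{(i)}(x) \in \mathcal{M}_s$.

\begin{align*}
a^{(i)}(\alpha^k) &= \sum_{j=0}^{n-1} T_{m_s}(\xi^i \alpha^{-js})\, \alpha^{jk} \\
&= \sum_{j=0}^{n-1} \sum_{l=0}^{m_s-1} \xi^{i2^l} \alpha^{(k - s2^l)j} \\
&= \begin{cases} \xi^{i2^l} & \text{if } k = s2^l \text{ for some } l, \\ 0 & \text{otherwise.} \end{cases}
\end{align*}

Thus $a^{(i)}(\alpha^k) = 0$ unless $k \in C_s$; hence $a^{(i)}(x) \in \mathcal{M}_s$. Then it is immediate that
$a^{(i)}(x)^{\varphi} = a^{(i)}(\alpha^s) = \xi^i$.

From Theorem 9, $\varphi$ must be the inverse of $\psi$ and both maps are 1-1 and
onto. Also $\varphi$ clearly preserves addition and multiplication, hence is an
isomorphism. Q.E.D.


The idempotent of $\mathcal{M}_s$ maps onto the unit of $GF(2^{m_s})$. If $c_1, c_2 \in \mathcal{M}_s$ and
$c_1(x)c_2(x) = \theta_s(x)$ then $c_1$ and $c_2$ are inverses in $\mathcal{M}_s$. (Of course $c_1$, $c_2$ have no
inverses in $R_n$.) Note also that $(xa(x))^{\varphi} = \beta a(\beta)$.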


Idempotents of cyclic codes over $GF(q)$. Let $\mathcal{C}$ be a cyclic code of length $n$
over $GF(q)$, where $n$ and $q$ are relatively prime. An element $E(x)$ of $R_n$, the
ring $GF(q)[x]/(x^n - 1)$, is an idempotent if $E(x) = E(x)^2$.


Problems. (23) Show that there is a unique polynomial in $\mathcal{C}$ which is both an
idempotent and a generator.
(24) If $\mathcal{C}$ has idempotent $E(x)$, show that $\mathcal{C}^{\perp}$ has idempotent $(1 - E(x))^*$.
(25) Show that there is a set of primitive idempotents $\theta_0, \theta_1, \ldots, \theta_t$ such
that

\[ \theta_i^2 = \theta_i, \qquad \theta_i\theta_j = 0 \text{ if } i \neq j, \qquad \sum_{i=0}^{t} \theta_i = 1. \]

Also $R_n$ is the direct sum of the minimal ideals generated by the $\theta_i$.
(26) Show that the minimal ideal of dimension $m_i$ generated by $\theta_i$ is
isomorphic to the field $GF(q^{m_i})$.


§4. Weight distribution of minimal codes


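Let $\mathcal{C}$ be an $[n, k]$ nondegenerate minimal cyclic code, with nonzeros
$\beta, \beta^2, \ldots, \beta^{2^{k-1}}$. Then $ns = 2^k - 1$, and $k$ is the smallest integer for which such
an equation is possible. (Thus $k = m$.) We suppose $s > 1$.

Let $\xi$ be a primitive element of $GF(2^k)$, with $\beta = \xi^s$. Let $\lambda(x) \in \mathcal{C}$ be the
codeword such that $\lambda(x)^{\varphi} = \xi$. Then $(x\lambda(x))^{\varphi} = \xi^{s+1}$; in fact, the $n$ cyclic shifts
of $\lambda(x)$ correspond to $\xi, \xi^{s+1}, \xi^{2s+1}, \ldots, \xi^{(n-1)s+1}$. Thus the codewords of $\mathcal{C}$
which correspond to $1, \xi, \xi^2, \ldots, \xi^{s-1}$ are a complete set of cycle representatives
for $\mathcal{C}$, and determine the weight distribution of $\mathcal{C}$.


A way to find the cycle representatives. Consider the $[2^k - 1, k]$ simplex code
$\mathcal{S}_k$ with nonzeros $\xi, \xi^2, \ldots, \xi^{2^{k-1}}$. Codewords of $\mathcal{S}_k$ are written as polynomials
in $y$, where $y^{2^k-1} = 1$. The idempotent of $\mathcal{S}_k$ is

\[ E(y) = \sum_{i=0}^{2^k-2} \epsilon_i y^i, \qquad \epsilon_i = T_k(\xi^{-i}). \]

The coefficients of $E(y)$ may be arranged in an $s \times n$ array

\[
\begin{array}{cccc}
\epsilon_0 & \epsilon_s & \cdots & \epsilon_{(n-1)s} \\
\epsilon_1 & \epsilon_{s+1} & \cdots & \epsilon_{(n-1)s+1} \\
\vdots & \vdots & & \vdots \\
\epsilon_{s-1} & \epsilon_{2s-1} & \cdots & \epsilon_{ns-1}
\end{array}
\tag{4}
\]

Let

\[ c_j(x) = \sum_{i=0}^{n-1} \epsilon_{is+j}\, x^i, \quad \text{for } 0 \le j \le s - 1. \]


Theorem 11. $c_j(x) \in \mathcal{C}$, and $c_j(x)^{\varphi} = \xi^{-j}$. Thus the $c_j(x)$ are a set of cycle
representatives for $\mathcal{C}$.


Proof.

\[ c_0(x) = \sum_{i=0}^{n-1} T_k(\xi^{-is})\, x^i = \sum_{i=0}^{n-1} T_k(\beta^{-i})\, x^i, \]

which is the idempotent of $\mathcal{C}$. Then

\[ c_j(x) = \sum_{i=0}^{n-1} T_k(\xi^{-j}\beta^{-i})\, x^i \]

and hence $c_j(x)^{\varphi} = \xi^{-j}$. Q.E.D.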


Clearly any cyclic shift of the idempotent will also give a set of cycle
representatives. A cyclic shift of the idempotent of $\mathcal{S}_k$ is easily found as the
output of the associated shift register (see Figs. 14.2, 14.3).

For example, the [15, 4, 8] code $\mathcal{S}_4$ has idempotent 000 100 110 101 111
from Equation (1) of Ch. 14, or Problem 7 of Ch. 8. Arranged as in (4) with
$s = 3$, $n = 5$ this becomes

    01111
    00101
    00011

giving a set of cycle representatives for the [5, 4, 2] code.
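The rearrangement into array (4) is mechanical, and the [15, 4, 8] example can be reproduced as follows. This is a small sketch with hypothetical function names; the only input is the idempotent quoted in the text.

```python
def cycle_representatives(idempotent, s, n):
    """Row j of array (4): coefficients e_j, e_{s+j}, ..., e_{(n-1)s+j}."""
    return [[idempotent[i * s + j] for i in range(n)] for j in range(s)]

def cyclic_shifts(word):
    """All cyclic shifts of a word, as a set of tuples."""
    return {tuple(word[-k:] + word[:-k]) for k in range(len(word))}

idempotent = [int(b) for b in "000100110101111"]     # idempotent of the [15, 4, 8] code
rows = cycle_representatives(idempotent, s=3, n=5)

# The zero word together with all cyclic shifts of the rows should be the whole [5, 4, 2] code.
codewords = {(0,) * 5}
for row in rows:
    codewords |= cyclic_shifts(row)

for row in rows:
    print("".join(map(str, row)))
print(len(codewords))
```

The three rows and their shifts account for all 15 nonzero words of even weight and length 5, so the weight distribution of the [5, 4, 2] code can be read off from the row weights alone.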


Problem. (27) Show that the weight distribution of the [21, 6] code is

    i:     0    8    12
    A_i:   1    21   42

A special case. Suppose the cycle representatives are $\theta(x), \lambda_1(x), \ldots, \lambda_{s-1}(x)$,
where $\theta(x)^{\varphi} = 1$, $\lambda_i(x)^{\varphi} = \xi^i$. Clearly $\lambda_2(x) = \lambda_1(x)^2 = \lambda_1(x^2)$, so $\lambda_2(x)$ has the
same weight as $\lambda_1(x)$. A particularly simple case is where all the $\lambda_i(x)$ have the
same weight as $\lambda_1(x)$. For example, if $n = 51 = (2^8 - 1)/5$, $\lambda(x)^{\varphi} = \xi$, $\lambda(x^2)^{\varphi} =
\xi^2$, $\lambda(x^4)^{\varphi} = \xi^4$, $\lambda(x^8)^{\varphi} = \xi^8 = \xi^{5+3}$. Thus $\lambda(x^8)$ is a cyclic shift of the codeword
corresponding to $\xi^3$. Hence the code has only two nonzero weights, namely
wt $(\theta(x))$ and wt $(\lambda(x))$.

In general, if $s$ is any prime for which 2 is a primitive root [i.e., there are
only two cyclotomic cosets mod $s$, $C_0$ and $C_1$], the code $\mathcal{M}_s$ has at most two
weights, $\tau_1$ and $\tau_2$ say. $\mathcal{M}_s$ contains $n$ codewords of weight $\tau_1$ which are the
cyclic shifts of $\theta_s$, and $n(s - 1)$ of weight $\tau_2$.


The weight distribution when there are two nonzero weights. Suppose $\mathcal{M}_s$ has
only two nonzero weights $\tau_1$, $\tau_2$, where the idempotent $\theta_s$ has weight $\tau_1$. The
dual code has minimum distance at least 3 by the BCH bound. From Theorem 1 of
Ch. 5,

\[ 1 + n y^{\tau_1} + n(s-1) y^{\tau_2} = 2^{m-n}\{(1+y)^n + A_3'(1+y)^{n-3}(1-y)^3 + \cdots\}. \]

Differentiating twice and setting $y = 1$ we obtain

\begin{align*}
\tau_1 + (s-1)\tau_2 &= 2^{m-1}, \\
\tau_1(\tau_1 - 1) + (s-1)\tau_2(\tau_2 - 1) &= (n-1)2^{m-2}.
\end{align*}

Solving for $\tau_2$ we find

\[ \tau_2 = \frac{2^{m-1} \pm 2^{m/2-1}}{s}. \]

Thus $m$ must be even, and

\[
\tau_1 = \begin{cases}
2^{m/2-1}\left(\dfrac{2^{m/2}+1}{s} - 1\right) & \text{if } s \text{ divides } 2^{m/2}+1, \\[2ex]
2^{m/2-1}\left(\dfrac{2^{m/2}-1}{s} + 1\right) & \text{if } s \text{ divides } 2^{m/2}-1.
\end{cases}
\]

Figure 8.3 gives some examples of two weight codes.

        n     m     s      τ_1       τ_2
       21     6     3        8        12
       51     8     5       32        24
       85     8     3       48        40
       93    10    11       32        48
      315    12    13      128       160
      341    10     3      160       176
      819    12     5      384       416
     1365    12     3      704       672
    13797    18    19   26·2^8    27·2^8

Fig. 8.3.
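The entries of Fig. 8.3 follow from the two closed forms above. The sketch below simply evaluates them; the function name `two_weights` is hypothetical, and the sign of $\tau_2$ is chosen to match the case used for $\tau_1$.

```python
def two_weights(m, s):
    """(tau_1, tau_2) from the formulas above; m must be even and s must divide 2^(m/2) + 1 or 2^(m/2) - 1."""
    if m % 2:
        raise ValueError("m must be even")
    root = 1 << (m // 2)                       # 2^(m/2)
    if (root + 1) % s == 0:
        tau1 = (root // 2) * ((root + 1) // s - 1)
        tau2 = ((1 << (m - 1)) + root // 2) // s
    elif (root - 1) % s == 0:
        tau1 = (root // 2) * ((root - 1) // s + 1)
        tau2 = ((1 << (m - 1)) - root // 2) // s
    else:
        raise ValueError("s divides neither 2^(m/2) + 1 nor 2^(m/2) - 1")
    return tau1, tau2

for m, s in [(6, 3), (8, 5), (8, 3), (10, 11), (12, 13), (10, 3), (12, 5), (12, 3), (18, 19)]:
    n = ((1 << m) - 1) // s
    print(n, m, s, *two_weights(m, s))
```

Each pair also satisfies the first moment equation $\tau_1 + (s-1)\tau_2 = 2^{m-1}$, which gives a quick consistency check on the table.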


Remark. Even if 2 is not a primitive root of s there are many other cases 
where the code has only 2 nonzero weights - see Problem 5 of Ch. 15. 


§5. The automorphism group of a code


Let $\mathcal{C}$ be a binary code of length $n$. Any permutation of the $n$ coordinate
places changes $\mathcal{C}$ into an equivalent code having many of the same properties
as $\mathcal{C}$ (same minimum weight, weight distribution, etc.).


Definition. The permutations of coordinate places which send $\mathcal{C}$ into itself
(codewords go into possibly different codewords) form the automorphism
group of $\mathcal{C}$, denoted by Aut $(\mathcal{C})$.


Problem. (28) Show that Aut $(\mathcal{C})$ is indeed a group.


A typical permutation $\pi$ of the symbols $\{1, 2, \ldots, n\}$ sends each $i$ into $\pi(i)$
(or, in a more convenient notation, into $i\pi$). The vector $c = (c_1, \ldots, c_n)$ goes
into $c\pi = (c_{\pi(1)}, \ldots, c_{\pi(n)})$.* If $\rho$ is another permutation, the product $\pi\rho$ means
"first apply $\pi$, then $\rho$". Thus $c(\pi\rho) = (c\pi)\rho$. E.g. if $\pi = (123)$, $\rho = (14)$,

\begin{align*}
(c_1, c_2, c_3, c_4)\pi &= (c_2, c_3, c_1, c_4), \\
(c_1, c_2, c_3, c_4)\pi\rho &= (c_2, c_3, c_4, c_1).
\end{align*}

* An equally good, although different, definition is $c\pi = (c_{\pi^{-1}(1)}, \ldots, c_{\pi^{-1}(n)})$.

Thus Aut $(\mathcal{C})$ is a subgroup of the symmetric group $S_n$ consisting of all $n!$
permutations of $n$ symbols.


Examples. (1) The automorphism groups of the repetition and even weight
codes are both equal to $S_n$.
(2) The group of the code

    1 2 3 4
    -------
    0 0 0 0
    1 1 0 0
    0 0 1 1
    1 1 1 1

consists of these 8 permutations:

(1), (12), (34), (12)(34), (13)(24), (14)(23), (1324), (1423).
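For codes this small the automorphism group can be found by brute force over all $n!$ permutations. The following sketch (hypothetical helper name, coordinates numbered from 0) counts the permutations that map the code of Example (2) onto itself.

```python
from itertools import permutations

def automorphisms(code):
    """All coordinate permutations mapping the set of codewords onto itself."""
    n = len(next(iter(code)))
    group = []
    for perm in permutations(range(n)):
        image = {tuple(c[perm[i]] for i in range(n)) for c in code}
        if image == code:
            group.append(perm)
    return group

code = {(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 1)}
repetition = {(0, 0, 0, 0), (1, 1, 1, 1)}

print(len(automorphisms(code)), len(automorphisms(repetition)))
```

This approach is only practical for very short codes, which is consistent with the later remark that the full automorphism group is in general difficult to determine.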


Problems. (29) If $\mathcal{C}$ is linear, Aut $\mathcal{C}$ = Aut $\mathcal{C}^{\perp}$.

(30) If $\mathcal{C}$ is linear and $\mathcal{C}_1$ is obtained from $\mathcal{C}$, (a) by adding a parity check,
or (b) by adding the vector 1, then Aut $\mathcal{C}_1 \supseteq$ Aut $\mathcal{C}$. In case (b), if $\mathcal{C}$ has odd
length and only even weights, then Aut $\mathcal{C}_1$ = Aut $\mathcal{C}$.

(31) Show that $\mathcal{C}_1 \subset \mathcal{C}_2$ does not imply that Aut $\mathcal{C}_1 \supseteq$ Aut $\mathcal{C}_2$.

(32) Let $\mathcal{C}$ be the [12, 6, 3] code with generator matrix specified by Fig. 8.4.
Thus row 1 has ones in coordinates {1236}, row 2 in {345}, .... Show that
Aut $(\mathcal{C})$ contains only the identity permutation.

It is in general difficult to determine the complete automorphism group of a
linear code, and even more difficult if the code is not linear. We shall see in
Ch. 16 how the automorphism group may be used for decoding.

Let $\mathcal{C}$ be an $[n, k]$ linear code with generator matrix $M$; $M$ contains $k$
linearly independent rows.


Fig. 8.4. A [12, 6, 3] code with trivial automorphism group.


Lemma 12. The permutation of coordinate places represented by the $n \times n$
matrix $A$ is in Aut $\mathcal{C}$ if and only if

\[ KM = MA \]

for some invertible $k \times k$ matrix $K$.


Proof. $MA$ is a generator matrix for $\mathcal{C}$ if and only if the corresponding
permutation is in Aut $(\mathcal{C})$. $MA$ can be obtained from $M$ by the linear
transformation $K$. Q.E.D.


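Example. The [7, 3, 4] simplex code $\mathcal{S}_3$ has

\[
M = \begin{pmatrix}
1 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 1 & 1 & 0 & 1
\end{pmatrix}
\]

with columns labeled 0, 1, ..., 6. The permutation $\sigma_2 = (0)(124)(365)$ sends codewords into codewords. In fact

\[
MA = \begin{pmatrix}
1 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 1 & 1 \\
0 & 1 & 1 & 1 & 0 & 1 & 0
\end{pmatrix}
= \begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & 1 \\
0 & 1 & 0
\end{pmatrix} M.
\]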
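Lemma 12 can be confirmed for this example by searching for $K$. The sketch below is illustrative only: it applies $\sigma_2$ with the convention $c\pi = (c_{\pi(0)}, \ldots, c_{\pi(n-1)})$ on coordinates 0 to 6, and tries every binary $3 \times 3$ matrix; the helper names are hypothetical.

```python
from itertools import product

M = [[1, 1, 1, 0, 1, 0, 0],
     [0, 1, 1, 1, 0, 1, 0],
     [0, 0, 1, 1, 1, 0, 1]]
sigma2 = [2 * i % 7 for i in range(7)]          # i -> 2i mod 7, i.e. (0)(124)(365)

def permute(c, pi):
    """c*pi = (c_{pi(0)}, ..., c_{pi(n-1)}), the convention assumed here."""
    return [c[pi[i]] for i in range(len(c))]

def mat_mul(K, M):
    """Matrix product over GF(2)."""
    return [[sum(K[r][t] * M[t][c] for t in range(len(M))) % 2
             for c in range(len(M[0]))] for r in range(len(K))]

MA = [permute(row, sigma2) for row in M]
candidates = ([list(bits[0:3]), list(bits[3:6]), list(bits[6:9])]
              for bits in product((0, 1), repeat=9))
solutions = [K for K in candidates if mat_mul(K, M) == MA]

print(MA)
print(solutions)
```

Because the rows of $M$ are linearly independent, at most one $K$ can satisfy $KM = MA$, so the search returns a single matrix.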


Definition. The set of all invertible k x k matrices over a field F is called the 
general linear group and is denoted by GL(k, F). If F is a finite field GF(q) we 
write this as GL(k, q). 


Theorem 13. The general linear group $GL(k, q)$ has order

\[ (q^k - 1)(q^k - q)(q^k - q^2) \cdots (q^k - q^{k-1}). \]


Proof. Let $K$ be a matrix in $GL(k, q)$. The first column of $K$ can be any
nonzero vector over $GF(q)$, hence can be chosen in $q^k - 1$ ways. The second
column must not be a multiple of the first, hence can be chosen in $q^k - q$
ways. And so on. Q.E.D.
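The counting formula is easy to compare against an exhaustive count in the binary case. In this sketch (hypothetical names) a binary matrix is a tuple of row bit masks, and its rank is found by reducing each row against the rows already kept.

```python
from itertools import product

def gl_order(k, q):
    """Order of GL(k, q) from Theorem 13."""
    order = 1
    for i in range(k):
        order *= q**k - q**i
    return order

def rank_gf2(rows):
    """Rank over GF(2) of a matrix given as row bit masks."""
    basis = []
    for r in rows:
        for b in basis:
            r = min(r, r ^ b)     # clear the leading bit of b from r if it is set
        if r:
            basis.append(r)
    return len(basis)

def count_invertible_binary(k):
    """Number of invertible k x k binary matrices, by exhaustive search."""
    return sum(1 for rows in product(range(1 << k), repeat=k) if rank_gf2(rows) == k)

print(gl_order(3, 2), count_invertible_binary(3))
```

The exhaustive count grows as $2^{k^2}$, so it is only a sanity check for small $k$; the formula itself is what is used in practice.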


By Lemma 12, if the columns of the generator matrix are distinct the automorphism
group of a binary linear code of dimension $k$ is isomorphic to a subgroup of $GL(k, 2)$.


Problems. (33) (Hard.) (a) Suppose no coordinate of $\mathcal{C}$ is always zero. Then
for any $K \in GL(k, 2)$, $KM$ is another generator matrix for the same code $\mathcal{C}$.
Show that $K$ corresponds to an element $\pi$ of Aut $(\mathcal{C})$ (i.e., the $n \times n$ matrix $A$
corresponding to $\pi$ satisfies $KM = MA$) iff $K$ preserves the weight of every
codeword. (b) Hence show that the automorphism group of the simplex code
$\mathcal{S}_m$ is isomorphic to $GL(m, 2)$.

(34) Other groups associated with a code. Suppose $\mathcal{C}$ is an $[n, k]$ code
over $GF(q)$, and let

\begin{align*}
G_1 &= \{A \in GL(n, q) : uA = u \text{ for all } u \in \mathcal{C}\}, \\
G_2 &= \{A \in GL(n, q) : uA \in \mathcal{C} \text{ for all } u \in \mathcal{C}\}.
\end{align*}

Show that

\begin{align*}
|G_1| &= q^{k(n-k)} \prod_{i=0}^{n-k-1} (q^{n-k} - q^i), \\
|G_2| &= q^{k(n-k)} \prod_{i=0}^{k-1} (q^k - q^i) \prod_{i=0}^{n-k-1} (q^{n-k} - q^i).
\end{align*}

Next we give a useful property of weight distributions.


Theorem 14. Suppose $\mathcal{C}$ is a code of length $N$ in which all weights are even and
with the property that no matter which coordinate is deleted, the resulting
punctured code (cf. §9 of Ch. 1) has the same weight distribution. If $\{A_i\}$ is the
weight distribution of $\mathcal{C}$, and $\{a_i\}$ is that of any of the punctured codes, then

\begin{align*}
a_{2j-1} &= \frac{2j A_{2j}}{N}, \\
a_{2j} &= \frac{N - 2j}{N} A_{2j}.
\end{align*}

Furthermore the punctured codes have odd minimum distance.


Proof. Consider the array whose rows are the $A_{2j}$ codewords of weight $2j$ in
$\mathcal{C}$. This array contains $2jA_{2j}$ 1's. By hypothesis the number of 1's in each
column of the array is the same, and is equal to $a_{2j-1}$. Thus

\[ a_{2j-1} = \frac{2jA_{2j}}{N}. \tag{5} \]

The second formula follows because $A_{2j} = a_{2j} + a_{2j-1}$. Q.E.D.
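Theorem 14 can be illustrated on the [8, 4, 4] extended Hamming code. The sketch below is an assumption-laden example rather than part of the text: it builds the [7, 4] cyclic code from the generator polynomial $1 + x + x^3$, appends an overall parity check, and compares the two formulas with every punctured code. All names are hypothetical.

```python
from itertools import product
from collections import Counter

def span(generators):
    """All binary linear combinations of the generator rows."""
    n = len(generators[0])
    return {tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % 2 for i in range(n))
            for coeffs in product((0, 1), repeat=len(generators))}

def weight_distribution(code):
    return Counter(sum(w) for w in code)

g = [1, 1, 0, 1, 0, 0, 0]                                  # 1 + x + x^3
hamming = span([g[-i:] + g[:-i] for i in range(4)])        # [7, 4, 3] cyclic code
extended = {w + (sum(w) % 2,) for w in hamming}            # [8, 4, 4], all weights even

N = 8
A = weight_distribution(extended)
for p in range(N):
    punctured = {w[:p] + w[p + 1:] for w in extended}
    a = weight_distribution(punctured)
    for j in range(N // 2 + 1):
        assert N * a[2 * j - 1] == 2 * j * A[2 * j]
        assert N * a[2 * j] == (N - 2 * j) * A[2 * j]

print(dict(A))
```

Here $A_4 = 14$ gives $a_3 = a_4 = 7$, and the punctured codes have odd minimum distance 3, as the theorem predicts.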
A group $G$ of permutations of the symbols $\{1, \ldots, n\}$ is transitive if for any
symbols $i$, $j$ there is a permutation $\pi \in G$ such that $i\pi = j$. More generally $G$ is
$t$-fold transitive if, given $t$ distinct symbols $i_1, i_2, \ldots, i_t$ and $t$ distinct symbols
$j_1, j_2, \ldots, j_t$, there is a $\pi \in G$ such that $i_1\pi = j_1, \ldots, i_t\pi = j_t$.

Corollary 15. Suppose $\mathcal{C}$ is fixed by a transitive permutation group. Then (i)
deleting any coordinate place gives an equivalent code $\mathcal{C}^*$, and (ii) if all
weights in $\mathcal{C}$ are even then $\mathcal{C}^*$ has odd minimum weight.


We shall see later that many extended cyclic codes do have transitive 
automorphism groups. But beware! Theorem 14 does not apply to all cyclic 
codes, as Problem (35) shows. 


Problems. (35) Let $\mathcal{C}$ be the degenerate [9, 3, 3] cyclic code generated by
$x^6 + x^3 + 1$. Show that the extended code does not satisfy the hypothesis of
Theorem 14.

(36) Show that if $\mathcal{C}$ has length $N$ and is fixed by a transitive permutation
group then $N \mid iA_i$ where $A_i$ is the number of codewords of weight $i$.

(37) Let $\mathcal{C}$ be a code of length $N$ in which all weights are even, which is
invariant under a transitive permutation group, and has weight enumerator
$W(x, y)$. Let $\mathcal{C}_1$ with weight enumerator $W_1(x, y)$ be obtained from $\mathcal{C}$ by
puncturing any one coordinate, and let $\mathcal{C}_2$ with weight enumerator $W_2(x, y)$ be
the even weight subcode of $\mathcal{C}_1$. Show that

\[ W_1(x, y) = \frac{1}{N}\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial y}\right) W(x, y) \quad \text{and} \quad W_2(x, y) = \frac{1}{N}\frac{\partial W}{\partial x}. \]

(38) (Camion.) Suppose Aut $(\mathcal{C})$ is $t$-fold transitive, where $\mathcal{C}$ is an $[n, k, d]$
cyclic code over $GF(q)$. Show that $k \ge (n - t + 1)/(d - t + 1)$.


The automorphism group of a cyclic code. By definition the automorphism
group of a cyclic code contains all the cyclic permutations, i.e., the cyclic
permutation $(0, 1, 2, \ldots, n - 1)$ and all its powers.

Because $n$ is odd, the map $\sigma_2 : x \to x^2$ is a permutation of $R_n$ (for $\sigma_2$
permutes the basis $1, x, x^2, \ldots, x^{n-1}$). Now $a(x)\sigma_2 = a(x^2) = a(x)^2$ is in the
same code as $a(x)$. Therefore the automorphism group of a cyclic code
contains $\sigma_2$ as well as all the cyclic permutations. $\sigma_2$ is a permutation of order
$m$, where $m = |C_1|$.
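The action of $\sigma_2$ can be checked on a small binary cyclic code. The sketch below uses the length 7 code generated by $1 + x + x^3$ as an assumed example; `sigma` moves the coefficient of $x^i$ to $x^{\mu i}$, and the helper names are hypothetical.

```python
from itertools import product

def cyclic_code(g, n):
    """All codewords of the binary cyclic code of length n with generator coefficients g."""
    k = n - (len(g) - 1)
    rows = [[0] * i + g + [0] * (n - len(g) - i) for i in range(k)]
    return {tuple(sum(c * r[j] for c, r in zip(coeffs, rows)) % 2 for j in range(n))
            for coeffs in product((0, 1), repeat=k)}

def sigma(word, mu):
    """The map a(x) -> a(x^mu) on coefficient vectors (mu prime to n)."""
    n = len(word)
    image = [0] * n
    for i, bit in enumerate(word):
        image[mu * i % n] = bit
    return tuple(image)

code = cyclic_code([1, 1, 0, 1], 7)                     # generated by 1 + x + x^3

fixed_by_sigma2 = {sigma(w, 2) for w in code} == code   # sigma_2 is an automorphism
fixed_by_sigma3 = {sigma(w, 3) for w in code} == code   # sigma_3 moves this code
print(fixed_by_sigma2, fixed_by_sigma3)
```

Applying `sigma(w, 2)` three times returns every word to itself, matching the order $m = |C_1| = 3$ for $n = 7$.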


Problems. (39) Find an automorphism group of the code of Fig. 5.1.
(40) Show that $\sigma_2$ and $T = (0, 1, \ldots, n - 1)$ together generate a group of
order $mn$ consisting of the permutations $\{\sigma_2^i T^j$, for $0 \le i < m$ and $0 \le j < n\}$.


Example. The effect of $\sigma_2$ on the codewords of the simplex code $\mathcal{S}_3$ is shown
in Fig. 8.5.

    1 x x^2 x^3 x^4 x^5 x^6
    1 1 1 0 1 0 0      E(x) -> E(x)^2 = E(x)
    0 1 1 1 0 1 0      a(x)
    0 0 1 1 1 0 1      a(x)^2
    0 1 0 0 1 1 1      a(x)^4
    1 0 0 1 1 1 0      b(x)
    1 1 0 1 0 0 1      b(x)^2
    1 0 1 0 0 1 1      b(x)^4

    (a(x) -> a(x)^2 -> a(x)^4 -> a(x) and b(x) -> b(x)^2 -> b(x)^4 -> b(x))

Fig. 8.5.


Equivalence of cyclic codes. We now consider $R_n$ to be the group algebra of a
cyclic group $G$ of order $n$ (rather than as a polynomial ring). The mappings
$\sigma_\mu : x^i \to x^{\mu i}$, where $\mu$ is an integer prime to $n$, form a group $\mathcal{G}$ of automorphisms
of $G$. [An automorphism of a group $G$ is a mapping $\sigma$ from $G$ onto itself which
preserves multiplication: $\sigma(ab) = \sigma(a)\sigma(b)$.] Thus $\sigma_\mu$ permutes the coordinate
places of $R_n$, and sends cyclic codes into cyclic codes.

For example, if $n = 7$, $\mu = 3$, $\sigma_3$ is the permutation $(1)(x, x^3, x^2, x^6, x^4, x^5)$ of
$G$ (or of the coordinate places). Thus $\sigma_3$ interchanges the idempotents
$\theta_1 = 1 + x + x^2 + x^4$ and $\theta_3 = 1 + x^3 + x^5 + x^6$, and so also interchanges the cyclic
codes generated by these idempotents.

$\mathcal{G}$ is a multiplicative abelian group, isomorphic to the multiplicative group
of integers less than and prime to $n$, and has order $\varphi(n)$ (see Problem 8 of Ch.
4).

The mapping $i \to i\mu$, where $\mu$ is prime to $n$, permutes the cyclotomic cosets.
For example if $n = 31$ the mapping $i \to 3i$ has the following effect:

\[ C_0 \to C_0, \qquad C_1 \to C_3 \to C_5 \to C_{15} \to C_7 \to C_{11} \to C_1 \]

(cf. Fig. 4.4).

On the other hand if $\mu$ is a power of 2, the mapping $\sigma_\mu : i \to i\mu$ fixes the
cyclotomic cosets. Similarly the mapping $\sigma_\mu$ fixes every cyclic code. Hence to
find the permutations which actually change cyclic codes we must factor out
the subgroup $\{\sigma_i : i \in C_1\}$ from $\mathcal{G}$. The quotient group consists of one $\sigma_\mu$ from
each cyclotomic coset containing numbers prime to $n$, and has order $\varphi(n)/m$.

For example, when $n = 63$ the cyclotomic cosets containing numbers prime
to $n$ are

    C_5  = { 5*  10   20   40   17   34 },
    C_11 = { 11  22   44   25*  50   37 },
    C_31 = { 31  62*  61   59   55   47 },
    C_23 = { 23  46   29   58*  53   43 },
    C_13 = { 13  26   52   41   19   38* },
    C_1  = { 1*  2    4    8    16   32 }.

The starred numbers are the powers of 5 mod 63; therefore in this case the
quotient group is a cyclic group of order 6.
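The way a multiplier $\mu$ permutes the cyclotomic cosets is easy to tabulate. The following sketch (hypothetical function names) labels each coset by its smallest element and follows the orbits of the map $i \to \mu i$.

```python
def coset_leader(i, n):
    """Smallest element of the cyclotomic coset {i, 2i, 4i, ...} mod n."""
    start = i % n
    j, best = start, start
    while True:
        j = 2 * j % n
        if j == start:
            return best
        best = min(best, j)

def coset_orbits(mu, n):
    """Orbits of the coset leaders under i -> mu * i (mu prime to n)."""
    leaders = sorted({coset_leader(i, n) for i in range(n)})
    seen, orbits = set(), []
    for s in leaders:
        if s not in seen:
            orbit, t = [], s
            while t not in orbit:
                orbit.append(t)
                t = coset_leader(mu * t, n)
            seen.update(orbit)
            orbits.append(orbit)
    return orbits

powers_of_5 = [pow(5, k, 63) for k in range(1, 7)]
print(powers_of_5, [coset_leader(p, 63) for p in powers_of_5])
print(coset_orbits(5, 63))
print(coset_orbits(3, 31))
```

For $n = 63$ the six powers of 5 fall into six different cosets, and `coset_orbits(5, 63)` groups the coset leaders into classes; `coset_orbits(3, 31)` reproduces the cycle quoted for $n = 31$.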

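The effect of $\sigma_5$ on the primitive idempotents (or on the cyclotomic cosets)
is

\begin{align*}
&\theta_1 \to \theta_5 \to \theta_{11} \to \theta_{31} \to \theta_{23} \to \theta_{13} \to \theta_1, \\
&\theta_{21} \to \theta_{21}, \\
&\theta_3 \to \theta_{15} \to \theta_3, \\
&\theta_7 \to \theta_7, \\
&\theta_9 \to \theta_{27} \to \theta_9.
\end{align*}

We observe that $\sigma_5$ has sorted the primitive idempotents into classes. Those
in the first class are the nondegenerate cyclic codes of length 63. Those in the
second are repetitions of a cyclic code of length 3, those in the third of length
21, in the fourth, 9, and in the fifth, 7. A moment's reflection shows that this
must always happen: any nondegenerate idempotent of block length $n$ may be
obtained by applying a suitable $\sigma_\mu$ to $\theta_1$.

Thus all nondegenerate minimal ideals of the same block length are
equivalent.

Of course using $\sigma_\mu$ we can see that many other codes are equivalent. E.g.
the codes of length 63 with idempotents

\[ \theta_1 + \theta_5, \quad \theta_5 + \theta_{11}, \quad \theta_{11} + \theta_{31}, \quad \theta_{31} + \theta_{23}, \quad \theta_{23} + \theta_{13}, \quad \theta_{13} + \theta_1 \]

are equivalent.


Problem. (41) (i) For $n = 7$ show that $i \to 3i$ maps

\[ C_1 \to C_3 \to C_1. \]

(ii) For $n = 15$, $i \to 7i$ maps

\[ C_1 \to C_7 \to C_1, \qquad C_3 \to C_3, \qquad C_5 \to C_5. \]

(iii) For $n = 127$, $i \to 3i$ maps

\begin{align*}
C_1 &\to C_3 \to C_9 \to C_{27} \to C_{13} \to C_{29} \to C_{47} \\
&\to C_7 \to C_{21} \to C_{63} \to C_{31} \to C_{55} \to C_{19} \\
&\to C_{23} \to C_{11} \to C_5 \to C_{15} \to C_{43} \to C_1.
\end{align*}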


The automorphism group of BCH codes. Let $\mathcal{C}$ be a code of length $n = 2^m - 1$.
We label the coordinate places by nonzero elements of $GF(2^m)$, i.e.,
$1, \alpha, \alpha^2, \ldots, \alpha^{2^m-2}$. We add an overall parity check place labeled by $\infty$,
corresponding to the zero of $GF(2^m)$:

    coordinate:                     0   1   2     ...   n-1        ∞
    corresponding element of
    GF(2^m):                        1   α   α^2   ...   α^(n-1)    0

Definition. The affine group. Let $P_{u,v}$ permute the elements of $GF(2^m)$ by:

\[ P_{u,v} : \alpha^i \to u\alpha^i + v, \qquad 0 \to v, \]

where $u, v \in GF(2^m)$, $u \neq 0$. $P_{\alpha^k,0}$ is a cyclic shift of $k$ places, fixing 0, i.e. fixing
the coordinate $\infty$.

E.g. for length $n = 7$, with $GF(2^3)$ as in Fig. 4.5, the permutation $P_{\alpha,\alpha}$ sends

\[ 1 \to 0 \to \alpha \to \alpha^4 \to \alpha^6 \to \alpha^3 \to \alpha^2 \to 1, \qquad \alpha^5 \to \alpha^5. \]

Thus

    element:              1     α     α^2   α^3   α^4   α^5   α^6   0
    codeword:             c_0   c_1   c_2   c_3   c_4   c_5   c_6   c_∞
    permuted codeword:    c_∞   c_4   c_0   c_2   c_6   c_5   c_3   c_1

The set of all $P_{u,v}$ forms a group called the affine group on $GF(2^m)$. This
group has order $2^m(2^m - 1)$ and is doubly transitive. (We shall use a $k$-dimensional
generalization of this group in Ch. 13).
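The affine group on $GF(2^3)$ is small enough to enumerate. The sketch below assumes the field is defined by $\alpha^3 = \alpha + 1$ and stores each element as a 3-bit integer, so that the integer 0 plays the role of the coordinate $\infty$. As an assumed test case it uses the [8, 4, 4] extended Hamming code, described as the set of even weight words whose positions of 1's sum to zero in the field. All names are hypothetical.

```python
from itertools import product

ALPHA = [1, 2, 4, 3, 6, 7, 5]            # alpha^0 .. alpha^6, assuming alpha^3 = alpha + 1
LOG = {v: i for i, v in enumerate(ALPHA)}

def mul(a, b):
    """Product in GF(8) of two elements given as bit masks."""
    return 0 if a == 0 or b == 0 else ALPHA[(LOG[a] + LOG[b]) % 7]

def affine(u, v):
    """P_{u,v} as a list: entry x is the image u*x + v of the field element x."""
    return [mul(u, x) ^ v for x in range(8)]

def locator_sum(word):
    """Sum in GF(8) of the positions (field elements) where the word has a 1."""
    total = 0
    for x, bit in enumerate(word):
        if bit:
            total ^= x
    return total

# Extended Hamming code: even weight and locator sum zero (S_0 = S_1 = 0).
code = {w for w in product((0, 1), repeat=8) if sum(w) % 2 == 0 and locator_sum(w) == 0}

def permuted(word, perm):
    """Permuted codeword: the entry at position x is the old entry at position perm[x]."""
    return tuple(word[perm[x]] for x in range(8))

P = affine(2, 2)                          # P_{alpha, alpha}
cycle, x = [], 1
while x not in cycle:
    cycle.append(x)
    x = P[x]

invariant = all(permuted(w, affine(u, v)) in code
                for u in range(1, 8) for v in range(8) for w in code)
print(cycle, invariant)
```

Here `cycle` lists the images of 1 under repeated application of $P_{\alpha,\alpha}$ as bit masks, and `invariant` reports whether all $8 \cdot 7 = 56$ affine permutations preserve the code.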


Problem. (42) Show that this group is doubly transitive: if $i_1 \neq i_2$, $j_1 \neq j_2$, the
equations $u\alpha^{i_1} + v = \alpha^{j_1}$, $u\alpha^{i_2} + v = \alpha^{j_2}$, or $u\alpha^{i_1} + v = \alpha^{j_1}$, $u\alpha^{i_2} + v = 0$ have a
solution for $u, v \in GF(2^m)$.


Theorem 16. Let $\mathcal{C}$ be a primitive BCH code of length $n = 2^m - 1$ and designed
distance $\delta$, and let $\hat{\mathcal{C}}$ be the extended code. Then the automorphism group of $\hat{\mathcal{C}}$
contains the affine group on $GF(2^m)$.


Proof. Let $P_{u,v}$ be any permutation in the affine group. Let $c =
(c_0, c_1, \ldots, c_{n-1}, c_\infty)$ be a codeword of weight $w$ in $\hat{\mathcal{C}}$, with 1's in coordinates
corresponding to the elements $X_1, \ldots, X_w$ in $GF(2^m)$. Thus $X_i = \alpha^{a_i}$ if $c_{a_i} = 1$,
$X_i = 0$ if $c_\infty = 1$.

Let $S_k = \sum_{i=1}^{w} X_i^k$, $k = 0, 1, 2, \ldots$, where $0^0 = 1$. Then $S_0 = \sum_{i=1}^{w} 1 = w = 0$
since $c \in \hat{\mathcal{C}}$. Also $S_k = c(\alpha^k) = 0$ for $k = 1, 2, \ldots, \delta - 1$ since $\mathcal{C}$ is a BCH code.
Let $X_i' = uX_i + v$ be the locations of the 1's in the permuted codeword.
Then

\begin{align*}
S_k' &= \sum_{i=1}^{w} (X_i')^k = \sum_{i=1}^{w} (uX_i + v)^k \\
&= \sum_{i=1}^{w} \sum_{l=0}^{k} \binom{k}{l} u^l X_i^l v^{k-l} = \sum_{l=0}^{k} \binom{k}{l} u^l v^{k-l} S_l \\
&= 0 \quad \text{for } 0 \le k \le \delta - 1.
\end{align*}

Therefore the permuted codeword is also in $\hat{\mathcal{C}}$. Q.E.D.

Corollary 17. Let $\mathcal{C}$ be cyclic and $\hat{\mathcal{C}}$ the extended code. If Aut $(\hat{\mathcal{C}})$ is transitive,
then $\mathcal{C}$ has an odd minimum distance $d$, and also contains codewords of
(even) weight $d + 1$. In particular this applies if $\mathcal{C}$ is a primitive binary BCH
code.


Proof. From Theorem 14, if $a_i$ is the number of codewords in $\mathcal{C}$ of weight $i$,

\[ (N - 2j)a_{2j-1} = 2j a_{2j}. \]

Q.E.D.


Problem. (43) Modify the proof of Theorem 16 to obtain the following
generalization due to Kasami, Lin and Peterson. For a positive integer $i$ let
$J(i)$ denote the set of positive integers whose binary expansion is "covered"
by that of $i$. Thus if $i = \sum_l \delta_l 2^l$, where $\delta_l = 0$ or 1, then

\[ J(i) = \Big\{ j : j = \sum_l \epsilon_l 2^l, \ \epsilon_l = 0 \text{ or } 1, \ \epsilon_l \le \delta_l \Big\}. \]

Let $\mathcal{C}$ be a cyclic code of length $2^m - 1$ for which 1 is a nonzero, and let $\hat{\mathcal{C}}$ be
the extended code. Then $\hat{\mathcal{C}}$ is invariant under the affine group iff whenever $\alpha^i$
is a zero of $\mathcal{C}$, so is $\alpha^j$ for all $j \in J(i)$.


Theorem 18. If $\mathcal{C}$ is a binary code which is invariant under a $t$-fold transitive
group $G$, then the codewords of each fixed weight $i$ in $\mathcal{C}$ form a $t$-$(n, i, \lambda)$
design, where $\lambda = A_i \binom{i}{t} / \binom{n}{t}$.


Proof. Let $S(P_1, \ldots, P_t)$ be the set of codewords of weight $i$ which contain
1's in coordinates $P_1, \ldots, P_t$. Since $G$ is $t$-fold transitive, $|S(P_1, \ldots, P_t)|$ is
independent of the particular choice of $P_1, \ldots, P_t$. Therefore the codewords
of weight $i$ form a $t$-$(n, i, \lambda)$ design. From Theorem 9 of Ch. 2, $A_i = \lambda \binom{n}{t} / \binom{i}{t}$.

Q.E.D.
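The design property can be observed directly on a small example. The sketch below again takes the [8, 4, 4] extended Hamming code as an assumed test case (positions are elements of $GF(2^3)$ stored as 3-bit integers, with 0 standing for $\infty$), and counts how many weight 4 codewords contain each pair of coordinates, for comparison with $\lambda = A_i \binom{i}{t} / \binom{n}{t}$ at $t = 2$. The names are hypothetical.

```python
from itertools import combinations, product
from math import comb

def locator_sum(word):
    """XOR of the positions of the 1's; positions are elements of GF(8) stored as bit masks."""
    total = 0
    for x, bit in enumerate(word):
        if bit:
            total ^= x
    return total

# Extended Hamming code of length 8: even weight and locator sum zero.
code = {w for w in product((0, 1), repeat=8) if sum(w) % 2 == 0 and locator_sum(w) == 0}

n, i, t = 8, 4, 2
blocks = [w for w in code if sum(w) == i]
A_i = len(blocks)
lam = A_i * comb(i, t) // comb(n, t)

counts = {pair: sum(1 for w in blocks if all(w[p] for p in pair))
          for pair in combinations(range(n), t)}
print(A_i, lam, set(counts.values()))
```

Every pair of coordinates lies in the same number of weight 4 codewords, which is the defining property of a 2-design.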


Corollary 19. The codewords of each weight in an extended primitive binary
BCH code form a 2-design. (But compare Theorem 15 of Ch. 2.)


Problems. (44) Show that the minimum distance of the double-error-correcting
BCH code of length $2^m - 1$ is exactly 5 if $m \ge 4$.

(45) Let $a + \mathcal{C}$ be a coset of $\mathcal{C}$ and $\pi \in$ Aut $(\mathcal{C})$. Show that $\pi(a + \mathcal{C})$ is also
a coset of $\mathcal{C}$.

(46) (Camion.) Corresponding to the $r$-th matrix $(a_{ij})$ in $GL(m, q)$ define the
column vector $x_r = (a_{11}, a_{12}, \ldots, a_{mm})^T$ of length $m^2$, for $r = 1, \ldots, n =
|GL(m, q)|$. Show that the code over $GF(q)$ with $m^2 \times n$ generator matrix
$G = (x_1, x_2, \ldots, x_n)$ has parameters


n=|](q"-q'), k-m,  den(q"-q4-(aq" -a" )K(q" —aXa" — 1). 


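The automorphism group of a nonbinary code.


Definition. A monomial matrix is a matrix with exactly one nonzero entry in
each row and column. Thus a monomial matrix over $GF(2)$ is a permutation
matrix, and a monomial matrix over an arbitrary field is a permutation matrix
times an invertible diagonal matrix.

Let $\mathcal{C}$ be a code of length $n$ over $GF(q)$. We first suppose that $q = p$ = a
prime.


Definition. The automorphism group Aut $(\mathcal{C})$ of a linear code $\mathcal{C}$ consists of all
$n \times n$ monomial matrices $A$ over $GF(p)$ such that $cA \in \mathcal{C}$ for all $c \in \mathcal{C}$.


For example the [3, 1, 3] code $\mathcal{C}$ = {000, 111, 222} over $GF(3)$ has automorphism
group of order $2 \cdot 3! = 12$. For the code is fixed by any permutation of
the coordinates, and by multiplying each codeword by 2, i.e. by the monomial
matrix

\[ \begin{pmatrix} 2&0&0 \\ 0&2&0 \\ 0&0&2 \end{pmatrix}. \]

Thus Aut $(\mathcal{C})$ consists of the monomials

\[
\begin{pmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{pmatrix},
\begin{pmatrix} 1&0&0 \\ 0&0&1 \\ 0&1&0 \end{pmatrix},
\begin{pmatrix} 0&0&1 \\ 0&1&0 \\ 1&0&0 \end{pmatrix},
\begin{pmatrix} 0&1&0 \\ 1&0&0 \\ 0&0&1 \end{pmatrix},
\begin{pmatrix} 0&1&0 \\ 0&0&1 \\ 1&0&0 \end{pmatrix},
\begin{pmatrix} 0&0&1 \\ 1&0&0 \\ 0&1&0 \end{pmatrix},
\]
\[
\begin{pmatrix} 2&0&0 \\ 0&2&0 \\ 0&0&2 \end{pmatrix},
\begin{pmatrix} 2&0&0 \\ 0&0&2 \\ 0&2&0 \end{pmatrix},
\begin{pmatrix} 0&0&2 \\ 0&2&0 \\ 2&0&0 \end{pmatrix},
\begin{pmatrix} 0&2&0 \\ 2&0&0 \\ 0&0&2 \end{pmatrix},
\begin{pmatrix} 0&2&0 \\ 0&0&2 \\ 2&0&0 \end{pmatrix},
\begin{pmatrix} 0&0&2 \\ 2&0&0 \\ 0&2&0 \end{pmatrix}.
\]


Problem. (47) Show the automorphism group of code #6 of Ch. 1 has order
48.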

If $q$ is a prime power, the automorphism group of $\mathcal{C}$ also contains any field
automorphisms of $GF(q)$ which preserve $\mathcal{C}$. An example will be given in Ch.
16.


§6. The Mattson-Solomon polynomial


In Ch. 7 the vector $a = (a_0, a_1, \ldots, a_{n-1})$ was represented by the polynomial
$a(x) = a_0 + a_1x + \cdots + a_{n-1}x^{n-1}$. We now introduce another polynomial
associated with $a$, the Mattson-Solomon polynomial $A(z)$. Let $F = GF(q)$,
$\mathcal{F} = GF(q^m)$, and $\alpha \in \mathcal{F}$ be a primitive $n$-th root of unity.


Definition. The Mattson-Solomon (MS) polynomial associated with a vector
$a = (a_0, a_1, \ldots, a_{n-1})$, $a_i \in \mathcal{F}$, is the following polynomial in $\mathcal{F}[z]$:

\[ A(z) = \sum_{j=1}^{n} A_j z^{n-j}, \tag{6} \]

where

\[ A_j = a(\alpha^j) = \sum_{i=0}^{n-1} a_i \alpha^{ij}, \qquad j = 0, \pm 1, \pm 2, \ldots. \tag{7} \]

(N.B. $A(z)$ is not to be taken mod $z^n - 1$.) Alternative forms for $A(z)$ are

\begin{align*}
A(z) &= \sum_{j=0}^{n-1} A_{-j} z^j \\
&= \sum_{i=0}^{n-1} a_i \sum_{j=0}^{n-1} (\alpha^{-i} z)^j. \tag{8}
\end{align*}

For example the MS polynomials of the codewords $1 + x + x^2 + x^4$ and
$x(1 + x + x^2 + x^4)$ in the [7, 3, 4] simplex code $\mathcal{S}_3$ are $z^3 + z^5 + z^6$ and
$\alpha^4 z^3 + \alpha^2 z^5 + \alpha z^6$ respectively, where $\alpha \in GF(2^3)$ (using Fig. 4.5).


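Remarks. (1) The coefficients $A_j$ are given by

\[
\begin{pmatrix} A_0 \\ A_1 \\ A_2 \\ \vdots \\ A_{n-1} \end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & \alpha & \alpha^2 & \cdots & \alpha^{n-1} \\
1 & \alpha^2 & \alpha^4 & \cdots & \alpha^{2(n-1)} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \alpha^{n-1} & \alpha^{2(n-1)} & \cdots & \alpha^{(n-1)^2}
\end{pmatrix}
\begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_{n-1} \end{pmatrix}.
\]

For this reason $A(z)$ is sometimes called a discrete Fourier transform of $a$;
however, we shall always refer to it as the Mattson-Solomon polynomial.

(2) If the $a_i \in F$, then $(A_j)^q = A_{qj}$ for all $j$ (subscripts mod $n$).

(3) A narrow-sense BCH code of designed distance $\delta$ can now be defined
as all vectors $a$ for which $A_1 = A_2 = \cdots = A_{\delta-1} = 0$.

(4) We apologize for using $A_i$ both for the number of codewords of weight
$i$ and for the coefficients of the MS polynomial. We hope the meaning will be
clear from the context.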

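Theorem 20. (Inversion formula.) The vector $a$ is recovered from $A(z)$ by

\begin{align*}
a_i &= \frac{1}{n} A(\alpha^i), \qquad i = 0, 1, \ldots, n - 1, \\
a &= \frac{1}{n} \big(A(1), A(\alpha), \ldots, A(\alpha^{n-1})\big), \tag{9} \\
a(x) &= \frac{1}{n} \sum_{i=0}^{n-1} A(\alpha^i) x^i.
\end{align*}


Proof. Use (16) of Ch. 7 and (6). Q.E.D.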
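Definition (6) and the inversion formula (9) can be tried out over $GF(2^3)$. The sketch below assumes the field is defined by $\alpha^3 = \alpha + 1$ (elements stored as 3-bit integers) and uses $n = 7$, so that $1/n = 1$ in characteristic 2. The function names are hypothetical.

```python
ALPHA = [1, 2, 4, 3, 6, 7, 5]            # alpha^0 .. alpha^6, assuming alpha^3 = alpha + 1
LOG = {v: i for i, v in enumerate(ALPHA)}

def mul(a, b):
    """Product in GF(8) of two elements given as bit masks."""
    return 0 if a == 0 or b == 0 else ALPHA[(LOG[a] + LOG[b]) % 7]

def ms_polynomial(a):
    """Coefficients of A(z), indexed by the power of z: the coefficient of z^(n-j) is A_j = a(alpha^j)."""
    n = len(a)
    coeffs = [0] * n
    for j in range(1, n + 1):
        A_j = 0
        for i, a_i in enumerate(a):
            A_j ^= mul(a_i, ALPHA[(i * j) % n])
        coeffs[n - j] = A_j
    return coeffs

def inverse_ms(coeffs):
    """a_i = A(alpha^i) / n, where n = 7 equals 1 in characteristic 2."""
    n = len(coeffs)
    a = []
    for i in range(n):
        value = 0
        for k, c in enumerate(coeffs):
            value ^= mul(c, ALPHA[(i * k) % n])
        a.append(value)
    return a

theta = [1, 1, 1, 0, 1, 0, 0]            # 1 + x + x^2 + x^4
shifted = [0, 1, 1, 1, 0, 1, 0]          # x(1 + x + x^2 + x^4)
print(ms_polynomial(theta))
print(ms_polynomial(shifted))
print(inverse_ms(ms_polynomial(shifted)))
```

In the bit-mask encoding used here, $\alpha$, $\alpha^2$ and $\alpha^4$ are 2, 4 and 6, so the second output corresponds to $\alpha^4 z^3 + \alpha^2 z^5 + \alpha z^6$.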


Notation. If $f(y)$ is any polynomial, the remainder when $f(y)$ is divided by
$y^n - 1$ will be denoted by $[f(y)]_n$. The componentwise product of two
polynomials

\[ f(y) = \sum_{i=0}^{n-1} f_i y^i \quad \text{and} \quad g(y) = \sum_{i=0}^{n-1} g_i y^i \]

is defined to be

\[ f(y) * g(y) = \sum_{i=0}^{n-1} f_i g_i y^i. \]


Lemma 21. If $a$ is a binary vector then $A(z)$ is an idempotent in the ring of
polynomials over $GF(2^m)$ taken modulo $z^n - 1$, i.e.

\[ [A(z)^2]_n = A(z). \]


Proof. $A(\alpha^i) = 0$ or 1 from Theorem 20. The result then follows from Lemma
2. Q.E.D.


Theorem 22. (Other properties.)
(i) If $c(x) = a(x) + b(x)$, then

\[ C(z) = A(z) + B(z). \]

(ii)

\[ c(x) = [a(x)b(x)]_n \quad \text{iff} \quad C(z) = A(z) * B(z). \]

(iii) Similarly

\[ c(x) = a(x) * b(x) \quad \text{iff} \quad C(z) = \frac{1}{n}[A(z)B(z)]_n. \]

(iv) The MS polynomial of a cyclic shift of $a$, $(a_1, a_2, \ldots, a_{n-1}, a_0)$, is $A(\alpha z)$.
(v) The MS polynomials of 0 and 1 are 0 and $n$ respectively.
(vi) An overall parity check on $a$ is given by

\[ \sum_{i=0}^{n-1} a_i = A(0) = A_0. \tag{10} \]


Proof of (ii). If $c(x) = [a(x)b(x)]_n$, then $C_j = c(\alpha^j) = a(\alpha^j)b(\alpha^j) = A_jB_j$;
therefore $C(z) = A(z) * B(z)$. Conversely, suppose $C_j = A_jB_j$ for all $j$. Then we
must show

\[ \sum_{i=0}^{n-1} c_i x^i = \Big[ \sum_{i=0}^{n-1} a_i x^i \sum_{i=0}^{n-1} b_i x^i \Big]_n. \tag{11} \]

From (16) of Ch. 7, the RHS of (11) is

\[ \Big( \frac{1}{n} \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} a(\alpha^j) \alpha^{-ij} x^i \Big) \Big( \frac{1}{n} \sum_{i=0}^{n-1} \sum_{l=0}^{n-1} b(\alpha^l) \alpha^{-il} x^i \Big) \]

reduced modulo $x^n - 1$. The coefficient of $x^k$ in this product is

\begin{align*}
&\sum_{i=0}^{n-1} \Big( \frac{1}{n} \sum_{j=0}^{n-1} a(\alpha^j) \alpha^{-ij} \Big) \Big( \frac{1}{n} \sum_{l=0}^{n-1} b(\alpha^l) \alpha^{-(k-i)l} \Big) \\
&\qquad = \frac{1}{n^2} \sum_{j=0}^{n-1} a(\alpha^j) \sum_{l=0}^{n-1} b(\alpha^l) \alpha^{-kl} \sum_{i=0}^{n-1} \alpha^{i(l-j)}.
\end{align*}

By Lemma 6 of Ch. 7 the inner sum is zero unless $j = l$, so this expression
becomes

\[ \frac{1}{n} \sum_{j=0}^{n-1} a(\alpha^j) b(\alpha^j) \alpha^{-kj} = \frac{1}{n} \sum_{j=0}^{n-1} A_j B_j \alpha^{-kj} = \frac{1}{n} \sum_{j=0}^{n-1} C_j \alpha^{-kj}, \]

which by Theorem 20 is equal to $c_k$, the coefficient of $x^k$ on the LHS of (11).
The proof of the rest of the theorem is straightforward and is omitted.
Q.E.D.

We now examine more carefully the mapping between $a(x)$ and its Mattson-Solomon
polynomial $A(z)$. (See Fig. 8.6.)

Let T(x) be the set of all polynomials in x with coefficients in GF(q"), of 
degree <n. T(x) can be made into a ring (see 82 of Ch. 7) in two ways. 
Addition is as usual in both rings. In T(x)o the multiplication of two 
polynomials is performed modulo x"—]1, ie. a(x)- b(x) - [a(x)b(x)]. In 
T(x)« the product of a(x) and b(x) is the componentwise product a(x) * 
b(x), as defined above. 

Then the mapping which sends a(x) into its Mattson-Solomon polynomial 
A(z) is, from Theorem 22, a ring isomorphism from T(x)o onto T(z)«, and 
also from T(x)« onto T(z)o. The inverse mapping is given by Equation (9). 
[A ring homomorphism is a mapping q« from a ring A into a ring B which 
preserves addition and multiplication, i.e. g(a,+ a;) = e(ai)) * e(a;) and 
(aia) = q(a)e(5;). If e is 1-to-1 and onto it is called a ring isomorphism.] 

In the binary case we can say a bit more. Set $q^m = 2^m$, and let S(x) be the subset of T(x) consisting of polynomials with coefficients restricted to GF(2). Define $S(x)_\odot$ and $S(x)_*$ as before. Any binary polynomial $a(x) \in S(x)_*$ is an idempotent, i.e. satisfies a(x) * a(x) = a(x). From Lemma 21, a(x) is mapped into an idempotent A(z) in $T(z)_\odot$. Note that the idempotents of $T(z)_\odot$ form a ring $E(z)_\odot$ say, for in characteristic 2 the sum of two idempotents is again an idempotent. In fact $E(z)_\odot$ is the image of $S(x)_*$ under the MS mapping. For the inverse image of an idempotent of $T(z)_\odot$ is in $S(x)_*$ by Theorem 20 and Lemma 2. Thus $S(x)_* \to E(z)_\odot$ is a ring isomorphism.

Let $E(z)_*$ be the ring consisting of the polynomials in $E(z)_\odot$ but with componentwise multiplication. (Note that the elements of $E(z)_*$ are not idempotents under componentwise multiplication.) Then $S(x)_\odot \to E(z)_*$ is also a ring isomorphism - see Fig. 8.6.


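T(x) = polynomials in x, coefficients from $GF(q^m)$, degree < n.

$T(x)_* = (T(x), +, *) \;\longleftrightarrow\; T(z)_\odot = (T(z), +, \cdot)$
$T(x)_\odot = (T(x), +, \cdot) \;\longleftrightarrow\; T(z)_* = (T(z), +, *)$

If q = 2, and S(x) = subset of T(x) with coefficients from GF(2):

$S(x)_* \;\longleftrightarrow\; E(z)_\odot = \text{idempotents} \subset T(z)_\odot$

$S(x)_\odot \;\longleftrightarrow\; E(z)_* \subset T(z)_*.$

Fig. 8.6. The Mattson-Solomon mapping.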










Binary linear codes are the linear subspaces of $S(x)_\odot$ (or $S(x)_*$), hence of $E(z)_*$ (or $E(z)_\odot$). Binary cyclic codes are the ideals in $S(x)_\odot$, and become ideals in $E(z)_*$. An ideal in $S(x)_\odot$ consists of all multiples of a fixed polynomial, as we saw in Theorem 1. Ideals in $E(z)_*$ also have a simple structure, as follows.


Lemma 23. An ideal in $E(z)_*$ consists of the set of all polynomials $A(z) = \sum_j A_j z^{n-j}$ such that $A_{i_1} = \cdots = A_{i_l} = 0$ for some fixed subscripts $i_1, \ldots, i_l$.


Proof. Let $\mathcal{C}$ be the image of this ideal in $S(x)_\odot$. Since $\mathcal{C}$ is also an ideal, it has a certain set of zeros $\alpha^{i_1}, \ldots, \alpha^{i_l}$ in $GF(2^m)$ - i.e. the zeros of the generator polynomial of $\mathcal{C}$ - such that

$$a(x) \in \mathcal{C} \ \text{ iff }\ a(\alpha^{i_1}) = \cdots = a(\alpha^{i_l}) = 0 \ \text{ iff }\ A_{i_1} = \cdots = A_{i_l} = 0. \qquad \text{Q.E.D.}$$


For example a narrow-sense BCH code of designed distance δ is the ideal of $E(z)_*$ with $A_1 = A_2 = \cdots = A_{\delta-1} = 0$.

In Ch. 12 it will be shown that Goppa codes can be described in terms of multiples of a fixed polynomial in $E(z)_\odot$. But ideals in $E(z)_\odot$ are not particularly interesting - see Problem 50.


Problems. (48) For n = 3 show that $E(z)_\odot$ consists of 0, 1, $z + z^2$, $1 + z + z^2$, $\alpha z + \alpha^2 z^2$, $\alpha^2 z + \alpha z^2$, $1 + \alpha z + \alpha^2 z^2$, $1 + \alpha^2 z + \alpha z^2$.


(49) Let $\mathcal{A}$ be a "cyclic code" in $E(z)_\odot$, i.e. a subspace of $E(z)_\odot$ such that $A(z) \in \mathcal{A} \Rightarrow [zA(z)]_n \in \mathcal{A}$. Show that $\mathcal{A} = \{0\}$ or $\mathcal{A} = \{0,\ 1 + z + z^2 + \cdots + z^{n-1}\}$.

(50) Let $\mathcal{A}$ be an ideal in $E(z)_\odot$. Show that the corresponding code in $S(x)_*$ consists of all vectors which are zero in certain specified coordinates. Thus these codes are not interesting. [Hint: work in $S(x)_*$.]


Locator polynomial. Suppose the vector $a = (a_0, a_1, \ldots, a_{n-1})$, $a_i \in F$, has nonzero components

$$a_{i_1}, a_{i_2}, \ldots, a_{i_w}$$

and no others, where w = wt(a). We associate with a the following elements of F:

$$X_1 = \alpha^{i_1}, \ \ldots, \ X_w = \alpha^{i_w},$$

called the locators of a, and the following elements of F,

$$Y_1 = a_{i_1}, \ \ldots, \ Y_w = a_{i_w},$$

giving the values of the nonzero components. Thus a is completely specified by the list $(X_1, Y_1), \ldots, (X_w, Y_w)$. Of course if a is a binary vector the $Y_i$'s are 1.
Note that

$$a(\alpha^j) = A_j = \sum_{i=1}^{w} Y_i X_i^j.$$


Definition. The locator polynomial of the vector a is

$$\sigma(z) = \prod_{i=1}^{w} (1 - X_i z) = \sum_{i=0}^{w} \sigma_i z^i.$$

(The roots of σ(z) are the reciprocals of the locators.) Thus the coefficients $\sigma_i$ are the elementary symmetric functions of the $X_i$:

$$\sigma_1 = -(X_1 + \cdots + X_w),$$
$$\sigma_2 = X_1X_2 + X_1X_3 + \cdots + X_{w-1}X_w,$$
$$\vdots$$
$$\sigma_w = (-1)^w X_1 \cdots X_w.$$


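Generalized Newton identities. The $A_j$'s and the $\sigma_i$'s are related by a set of simultaneous linear equations.


Theorem 24. For all j, the $A_j$'s satisfy the recurrence

$$A_{j+w} + \sigma_1 A_{j+w-1} + \cdots + \sigma_w A_j = 0. \tag{12}$$

In particular, taking j = 1, 2, ..., w,

$$\begin{pmatrix}
A_w & A_{w-1} & \cdots & A_1 \\
A_{w+1} & A_w & \cdots & A_2 \\
\vdots & \vdots & & \vdots \\
A_{2w-1} & A_{2w-2} & \cdots & A_w
\end{pmatrix}
\begin{pmatrix} \sigma_1 \\ \sigma_2 \\ \vdots \\ \sigma_w \end{pmatrix}
= -\begin{pmatrix} A_{w+1} \\ A_{w+2} \\ \vdots \\ A_{2w} \end{pmatrix}. \tag{13}$$


Proof. In the equation

$$\prod_{i=1}^{w} (1 - X_i z) = 1 + \sigma_1 z + \cdots + \sigma_w z^w$$

put $z = 1/X_i$ and multiply by $Y_i X_i^{j+w}$:

$$Y_i X_i^{j+w} + \sigma_1 Y_i X_i^{j+w-1} + \cdots + \sigma_w Y_i X_i^{j} = 0.$$

Summing on i = 1, ..., w gives (12). Q.E.D.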
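
The recurrence (12) is easy to test numerically. The sketch below is only an illustration under assumptions made here: it uses the prime field GF(13) with n = 12 and α = 2 (a primitive 12th root of unity modulo 13), and an arbitrarily chosen vector of weight 3. It builds σ(z) from the locators and checks that A_{j+w} + σ_1 A_{j+w-1} + ... + σ_w A_j vanishes for a range of j.

```python
# Toy setting (assumed for illustration): F = GF(13), n = 12, alpha = 2.
P, N, ALPHA = 13, 12, 2

a = [0, 4, 0, 0, 7, 0, 0, 0, 0, 1, 0, 0]      # a vector of weight w = 3
support = [i for i, v in enumerate(a) if v]
X = [pow(ALPHA, i, P) for i in support]         # locators X_r = alpha^(i_r)
Y = [a[i] for i in support]                     # values   Y_r = a_(i_r)
w = len(X)

# sigma(z) = prod_r (1 - X_r z), stored as [sigma_0, sigma_1, ..., sigma_w]
sigma = [1]
for x in X:
    nxt = sigma + [0]
    for k in range(len(sigma)):
        nxt[k + 1] = (nxt[k + 1] - x * sigma[k]) % P
    sigma = nxt

def A(j):
    """Power sum A_j = sum_r Y_r X_r^j, which equals a(alpha^j)."""
    return sum(y * pow(x, j, P) for x, y in zip(X, Y)) % P

# Left-hand side of (12) for j = 0, 1, ..., 2n - 1; every entry should be 0.
residues = [sum(sigma[k] * A(j + w - k) for k in range(w + 1)) % P
            for j in range(2 * N)]
print(residues)
```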
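Problems. (51) Let a be a vector of weight w. Show that

$$\begin{pmatrix}
A_v & A_{v-1} & \cdots & A_1 \\
\vdots & \vdots & & \vdots \\
A_{2v-1} & A_{2v-2} & \cdots & A_v
\end{pmatrix}$$

is nonsingular if v = w, but is singular if v > w.

(52) The usual form of Newton's identities. Let $X_1, \ldots, X_w$ be indeterminates, and

$$\sigma(z) = \prod_{i=1}^{w} (1 - X_i z) = \sum_{i=0}^{w} \sigma_i z^i,$$

where $\sigma_i$ is an elementary symmetric function of the $X_i$, $\sigma_0 = 1$, and $\sigma_i = 0$ for i > w. Define the power sums

$$P_i = \sum_{r=1}^{w} X_r^i \quad \text{for all } i.$$

(a) If $P(z) = \sum_{i=1}^{\infty} P_i z^i$, show that

$$\sigma(z)P(z) + z\sigma'(z) = 0.$$

(b) By equating coefficients show that

$$\begin{aligned}
P_1 + \sigma_1 &= 0, \\
P_2 + \sigma_1 P_1 + 2\sigma_2 &= 0, \\
&\ \ \vdots \\
P_w + \sigma_1 P_{w-1} + \cdots + \sigma_{w-1} P_1 + w\sigma_w &= 0,
\end{aligned} \tag{14}$$

and, for i > w,

$$P_i + \sigma_1 P_{i-1} + \cdots + \sigma_w P_{i-w} = 0. \tag{15}$$

Observe that (15) agrees with (12) in the case that all $Y_i$ are 1 (e.g. if a is a binary vector).

(53) Suppose the $X_i$ belong to a field of characteristic p. For s fixed, show that $P_l = 0$ for $1 \le l \le s$ iff $\sigma_l = 0$ for all l in the range $1 \le l \le s$ which are not divisible by p.

(54) In the binary case show that Equations (14), (15) imply

$$\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & \cdots & 0 \\
A_2 & A_1 & 1 & 0 & 0 & \cdots & 0 \\
A_4 & A_3 & A_2 & A_1 & 1 & \cdots & 0 \\
\vdots & & & & & & \vdots \\
A_{2w-2} & A_{2w-3} & \cdots & & & \cdots & A_{w-1}
\end{pmatrix}
\begin{pmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \\ \vdots \\ \sigma_w \end{pmatrix}
= \begin{pmatrix} A_1 \\ A_3 \\ A_5 \\ \vdots \\ A_{2w-1} \end{pmatrix}. \tag{16}$$

(55) (Chien.) Use Equation (12) to give another proof of the BCH bound.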
(56) (Chien and Choy.) In the binary case (q = 2) show that the MS polynomial of a(x) can be obtained from the locator polynomial σ(z) of a(x) by the formula

$$A(z) = \frac{z(z^n + 1)\,\sigma_R'(z)}{\sigma_R(z)} + (z^n + 1)\,w,$$

where $\sigma_R(z) = z^w \sigma(1/z)$ and w = a(1) = wt(a). Hence $\sigma_R(z)A(z) \equiv 0 \bmod z^n + 1$.


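Application to decoding BCH codes. Equations (12)-(16) are important for decoding a BCH code of designed distance δ, as we shall see in the next chapter. In this application a is the error vector and the decoder can easily find $A_1, \ldots, A_{\delta-1}$. Then Equations (12)-(16) are used to determine σ(z). To obtain the $Y_i$'s, define the evaluator polynomial

$$\omega(z) = \sigma(z) + \sum_{i=1}^{w} z X_i Y_i \prod_{j \ne i} (1 - X_j z). \tag{17}$$

Once ω(z) is known, $Y_i$ is given by

$$Y_i = \frac{\omega(X_i^{-1})}{\prod_{j \ne i} (1 - X_j X_i^{-1})} = -\,\frac{X_i\,\omega(X_i^{-1})}{\sigma'(X_i^{-1})}. \tag{18}$$

Theorem 25.

$$\omega(z) = (1 + S(z))\,\sigma(z), \tag{19}$$

where

$$S(z) = \sum_{i=1}^{\infty} A_i z^i. \tag{20}$$

Note that since $\deg \omega(z) \le \deg \sigma(z) = w$, only $A_1, \ldots, A_w$ are needed to determine ω(z) from (19).

Proof.

$$\frac{\omega(z)}{\sigma(z)} = 1 + \sum_{i=1}^{w} \frac{z X_i Y_i}{1 - X_i z}
= 1 + \sum_{i=1}^{w} Y_i \sum_{j=1}^{\infty} (X_i z)^j
= 1 + S(z). \qquad \text{Q.E.D.}$$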
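
To see (18) and (19) at work, the following sketch recovers the values Y_i from the power sums and the locator polynomial. It is an illustration under assumptions chosen here, not the decoder of the next chapter: the field is the prime field GF(13) with n = 12 and α = 2, the vector a is arbitrary, and the helper names are hypothetical. In a real decoder the power sums come from the received word rather than from a itself.

```python
# Toy setting (assumed for illustration): F = GF(13), n = 12, alpha = 2.
P, N, ALPHA = 13, 12, 2

def poly_mul(f, g):
    """Product of two polynomials over GF(P); index = power of z."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] = (out[i + j] + fi * gj) % P
    return out

def poly_eval(f, z):
    return sum(c * pow(z, k, P) for k, c in enumerate(f)) % P

a = [0, 4, 0, 0, 7, 0, 0, 0, 0, 1, 0, 0]
support = [i for i, v in enumerate(a) if v]
X = [pow(ALPHA, i, P) for i in support]       # locators
Y = [a[i] for i in support]                   # values to be recovered
w = len(X)

sigma = [1]
for x in X:
    sigma = poly_mul(sigma, [1, (-x) % P])    # multiply by (1 - x z)

# Power sums A_1, ..., A_w
A = [sum(y * pow(x, j, P) for x, y in zip(X, Y)) % P for j in range(1, w + 1)]

# (19): omega = (1 + S) * sigma; terms beyond degree w vanish, so truncate.
omega = poly_mul([1] + A, sigma)[: w + 1]
sigma_prime = [(k * sigma[k]) % P for k in range(1, w + 1)]

# (18): Y_i = -X_i * omega(1/X_i) / sigma'(1/X_i)
recovered = []
for x in X:
    x_inv = pow(x, -1, P)                     # needs Python 3.8+
    num = (-x * poly_eval(omega, x_inv)) % P
    den = poly_eval(sigma_prime, x_inv)
    recovered.append(num * pow(den, -1, P) % P)
print(recovered)
```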


The weight of a vector.


Theorem 26. The weight of a is n - r, where r is the number of nth roots of unity which are zeros of the Mattson-Solomon polynomial A(z).

Proof. From (9).

Corollary 27. If a has MS polynomial A(z), then $\mathrm{wt}(a) \ge n - \deg A(z)$.

We use Theorem 26 to give another proof of the BCH bound.


Theorem 28. (The BCH bound.) Let $\mathcal{C}$ be a cyclic code with generator polynomial $g(x) = \prod_{i \in K} (x - \alpha^i)$ where K contains a string of d - 1 consecutive integers b, b + 1, ..., b + d - 2, for some b. Then the minimum weight of any nonzero codeword $a \in \mathcal{C}$ is at least d.


Proof. Since a(x) is a multiple of g(x), by hypothesis $a(\alpha^j) = 0$ for $b \le j \le b + d - 2$. Therefore

$$A(z) = a(\alpha)z^{n-1} + \cdots + a(\alpha^{b-1})z^{n-b+1} + a(\alpha^{b+d-1})z^{n-b-d+1} + \cdots + a(\alpha^n).$$

Let

$$\begin{aligned}
\hat{A}(z) &= z^{b-1}A(z) - \bigl(a(\alpha)z^{b-2} + \cdots + a(\alpha^{b-1})\bigr)(z^n - 1) \\
&= a(\alpha^{b+d-1})z^{n-d} + \cdots + a(\alpha^n)z^{b-1} + a(\alpha)z^{b-2} + \cdots + a(\alpha^{b-1}).
\end{aligned}$$

Clearly the number of nth roots of unity which are zeros of A(z) is the same as the number which are zeros of $\hat{A}(z)$. This number is $\le \deg \hat{A}(z) \le n - d$. Thus the weight of a is at least d by Theorem 26. Q.E.D.


Mattson-Solomon polynomials of minimal ideals. In the remainder of this section we restrict ourselves to binary codes.
The MS polynomials of $\theta_s$, $\theta_s^*$, $x^i\theta_s^*$ are respectively

$$\sum_{j \in C_{-s}} z^j, \qquad \sum_{j \in C_s} z^j, \qquad \sum_{j \in C_s} \alpha^{-ij} z^j.$$

We see that it is easier to work with $\theta_s^*$ than with $\theta_s$. Provided we are careful, the notation can be simplified by using the trace function. Let

$$T_{m_s}(z^s) = z^s + z^{2s} + z^{2^2 s} + \cdots + z^{2^{m_s-1}s},$$

where exponents greater than n are immediately reduced modulo n. For example, if n = 15, $T_4(z^3) = z^3 + z^6 + z^{12} + z^9$, and not $z^3 + z^6 + z^{12} + z^{24}$. Failure to observe this convention leads to errors.
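
The reduction convention amounts to listing the cyclotomic coset of s modulo n. A minimal sketch (the function name is hypothetical) that reproduces the n = 15 example:

```python
def cyclotomic_coset(s, n):
    """Coset of s under repeated multiplication by 2 modulo n (n odd).

    These are exactly the exponents of z in T_{m_s}(z^s) once every
    exponent has been reduced modulo n.
    """
    coset, j = [], s % n
    while j not in coset:
        coset.append(j)
        j = (2 * j) % n
    return coset

reduced = cyclotomic_coset(3, 15)           # exponents in T_4(z^3) for n = 15
unreduced = [3 * 2**i for i in range(4)]    # what one gets by forgetting to reduce
print(reduced, unreduced)                   # [3, 6, 12, 9] [3, 6, 12, 24]
```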


Problem. (57) For n 231, show that T;(z*)=@? when z=o, w being a 
primitive cube root of unity. 

Thus the MS polynomials of $\theta_s^*$, $x^i\theta_s^*$ and $x^i\theta_s$ are respectively $T_{m_s}(z^s)$, $T_{m_s}(\alpha^{-is}z^s)$ and $T_{m_s}(\alpha^{is}z^{-s})$.

We can also handle minimal ideals which are not just cyclic shifts of the idempotent. Let $\mathcal{M}_s^* = (\theta_s^*)$ be any minimal ideal. The elements $a(\alpha^{-s})$ where $a(x) \in \mathcal{M}_s^*$, form a subfield $\mathcal{F}_s$ of $GF(2^m)$ which is isomorphic to $GF(2^{m_s})$, as in Theorem 9. Therefore the MS polynomial of a(x) is

$$A(z) = \sum_{j \in C_s} a(\alpha^{-j})\, z^j = T_{m_s}\bigl(a(\alpha^{-s})\, z^s\bigr) = T_{m_s}(\beta z^s),$$

where $\beta = a(\alpha^{-s})$ is an element of $\mathcal{F}_s$. Exponents of β in the latter expression are reduced modulo $2^{m_s} - 1$, and the exponents of z are reduced modulo n.


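Mattson-Solomon polynomials of codewords. Any element $a(x) \in R_n$ can be written, by Theorem 7 (iii), as

$$a(x) = \sum_{s} a_s(x)\,\theta_s^*(x). \tag{21}$$

Its MS polynomial is

$$A(z) = \sum_{s} T_{m_s}(\beta_s z^s), \qquad \beta_s \in \mathcal{F}_s, \tag{22}$$

where s runs through a complete set of coset representatives modulo n. $a_s(x)$ may be zero, in which case $\beta_s$ is also zero.
Note (21) may be written as

$$a(x) = \sum_{s \in S} x^{i_s}\theta_s^*(x), \tag{23}$$

where S is a set of coset representatives with repetitions allowed. The MS polynomial of a(x) is then

$$A(z) = \sum_{s \in S} \sum_{j \in C_s} (\alpha^{-i_s} z)^j. \tag{24}$$

Let $\mathcal{C}$ be a cyclic code whose nonzeros are in $C_{-u_1}, \ldots, C_{-u_l}$. A codeword a(x) in $\mathcal{C}$ can be written as

$$a(x) = \sum_{i=1}^{l} a_i(x)\,\theta_{u_i}^*(x),$$

and has MS polynomial

$$A(z) = \sum_{i=1}^{l} T_{n_i}(\beta_i z^{u_i}),$$

where $n_i = |C_{-u_i}|$ and $\beta_i \in \mathcal{F}_{u_i}$. A codeword in $\mathcal{C}^\perp$ has the form

$$b(x) = \sum_{s \ne u_1, \ldots, u_l} b_s(x)\,\theta_s(x),$$

with MS polynomial

$$B(z) = \sum_{s \ne u_1, \ldots, u_l} T_{m_s}(\beta_s z^{-s}).$$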
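Example. Codes of length 15 (cf. Problem 7). The [15, 4, 8] simplex code $(\theta_1^*) = (\theta_7)$ consists of the vectors with MS polynomials

$$A(z) = T_4(\beta z), \qquad \beta \in GF(2^4).$$

In general the $[n = 2^m - 1, m, 2^{m-1}]$ simplex code $\mathcal{S}_m$ is given by

$$A(z) = T_m(\beta z), \qquad \beta \in GF(2^m). \tag{25}$$

The [15, 2, 10] simplex code $(\theta_5) = (\theta_5^*)$ consists of 0, $\theta_5$, $x\theta_5$, $x^2\theta_5$. The MS polynomials are

$$T_2(\beta z^5), \qquad \beta \in \{0, 1, \alpha^5, \alpha^{10}\} = GF(2^2),$$

i.e.

$$0, \quad z^5 + z^{10}, \quad \alpha^5 z^5 + \alpha^{10} z^{10}, \quad \alpha^{10} z^5 + \alpha^5 z^{10}.$$

The [15, 4, 6] code $(\theta_3) = (\theta_3^*)$ consists of the MS polynomials

$$T_4(\beta z^3), \qquad \beta \in GF(2^4).$$

The [15, 11, 3] Hamming code consists of all vectors of the form

$$a_0\theta_0(x) + a_1 x^{i_1}\theta_1^*(x) + f_3(x)\theta_3(x) + a_5 x^{i_5}\theta_5(x),$$

where $a_0, a_1, a_5 \in GF(2)$; $i_1, i_5 \in \{0, 1, \ldots, 14\}$; $\deg f_3(x) \le 3$. The MS polynomials are

$$a_0 + T_4(\beta_1 z) + T_4(\beta_3 z^3) + T_2(\beta_5 z^5), \qquad \beta_1, \beta_3 \in GF(2^4),\ \beta_5 \in GF(2^2).$$

A similar expression holds for any Hamming code.


Problem. (58) Show that the [23, 12, 7] Golay code consists of the vectors with MS polynomial

$$a + T_{11}(\beta z), \qquad a \in GF(2),\ \beta \in GF(2^{11}).$$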


Another formula for the weight of a vector. Theorem 31 gives a formula which 
will sometimes enable us to find the weight distribution of a code explicitly. 
The proof depends on an identity (Theorem 29) satisfied by the MS 
polynomial. 

Let A'(z) denote the derivative of A(z). We note that in characteristic 2, zA'(z) consists of those terms $\gamma z^j$ in A(z) for which j is odd.


Theorem 29. The MS polynomial A(z) of any vector $a \in R_n$ satisfies

$$A(z)\bigl(A(z) + 1\bigr) = zA'(z)\,(z^n + 1).$$
Proof. By (24) it suffices to prove the theorem for $A(z) = \sum_{j \in C_s} (\beta z)^j$, where $\beta^n = 1$. Let $[r]_s$ denote the remainder when r is divided by s. Then

$$A(z) = \sum_{i=0}^{m_s-1} (\beta z)^{[s2^i]_n}, \qquad
A(z)^2 = \sum_{i=0}^{m_s-1} (\beta z)^{2[s2^i]_n}.$$

Note that the exponents in $A(z)^2$ may be greater than n. If $[s2^i]_n$ is an even number, then

$$[s2^i]_n = 2[s2^{i-1}]_n$$

and in $A(z)^2 + A(z)$ the term with exponent $[s2^i]_n$ in A(z) cancels the term with exponent $2[s2^{i-1}]_n$ in $A(z)^2$. If the exponent $[s2^i]_n$ is odd, then

$$[s2^i]_n = 2[s2^{i-1}]_n - n.$$

Combining the term with exponent $2[s2^{i-1}]_n - n$ in A(z) and the term with exponent $2[s2^{i-1}]_n$ in $A(z)^2$ we obtain

$$(\beta z)^{[s2^i]_n}(1 + z^n) \qquad (\text{recall that } \beta^n = 1).$$

Thus we obtain $(z^n + 1)$ multiplied by the terms with odd exponents in A(z), i.e. $zA'(z)(z^n + 1)$. Q.E.D.
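
The identity of Theorem 29 can be spot-checked by machine for n = 15. The sketch below is an illustration only: it assumes GF(16) is built from the primitive polynomial x^4 + x + 1 (any primitive polynomial would do), represents field elements as 4-bit integers, and uses hypothetical helper names. Polynomials are coefficient lists indexed by the power of z and are deliberately not reduced modulo z^n + 1.

```python
# GF(16) from the primitive polynomial x^4 + x + 1 (an assumed choice).
EXP = [1] * 15
for i in range(1, 15):
    v = EXP[i - 1] << 1
    EXP[i] = v ^ 0b10011 if v & 0b10000 else v
LOG = {v: i for i, v in enumerate(EXP)}
N = 15

def gf_mul(x, y):
    return 0 if x == 0 or y == 0 else EXP[(LOG[x] + LOG[y]) % 15]

def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] ^= gf_mul(fi, gj)
    return out

def poly_add(f, g):
    m = max(len(f), len(g))
    f, g = f + [0] * (m - len(f)), g + [0] * (m - len(g))
    return [x ^ y for x, y in zip(f, g)]

def ms_poly(a):
    """Coefficients of A(z) = sum_j A_j z^(n-j) for a binary vector a."""
    A = [0] * N
    for j in range(1, N + 1):
        Aj = 0
        for i, ai in enumerate(a):
            if ai:
                Aj ^= EXP[(i * j) % 15]       # add alpha^(i j)
        A[N - j] = Aj
    return A

def identity_holds(a):
    A = ms_poly(a)
    lhs = poly_add(poly_mul(A, A), A)                         # A(z)^2 + A(z)
    zA_prime = [c if k % 2 else 0 for k, c in enumerate(A)]   # odd-degree terms of A
    rhs = poly_mul(zA_prime, [1] + [0] * (N - 1) + [1])       # times (z^n + 1)
    return not any(poly_add(lhs, rhs))

sample = [[(v >> i) & 1 for i in range(N)] for v in range(0, 2**N, 997)]
print(all(identity_holds(a) for a in sample))
```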


Problem. (59) Check the theorem for n = 9, $C_1 = \{1, 2, 4, 8, 7, 5\}$.

Corollary 30. If ξ is a zero of A'(z) or $z^{n+1} + z$, then A(ξ) is 0 or 1.


Define

$$f_1(z) = (A(z),\ z^n + 1),$$
$$f_2(z) = (A(z),\ zA'(z)),$$

where (a(z), b(z)) denotes the monic greatest common divisor of a(z) and b(z). Since A(z), A(z) + 1 are relatively prime, Theorem 29 implies

$$A(z) = \beta f_1(z) f_2(z), \quad \text{for some } \beta \in GF(2^m),\ \beta \ne 0.$$

We have therefore proved:


Theorem 31. The weight of the vector with MS polynomial A(z) is

$$n - \deg f_1(z) = n - \deg A(z) + \deg f_2(z).$$


Problem. (60) Let the binary expansion of the weight w of vector $a = (a_0 a_1 \cdots a_{n-1})$ be

$$w = \Gamma_0(a) + 2\Gamma_1(a) + 4\Gamma_2(a) + \cdots,$$

where $\Gamma_i(a) = 0$ or 1. Show:
(i) $\Gamma_i(a) \equiv \binom{w}{2^i} \bmod 2$ [Hint: Lucas' Theorem 28 of Ch. 13];
(ii) $\binom{w}{2} = \sum_{i<j} a_i a_j$;
(iii) Hence if w is even,

$$\Gamma_1(a) = \sum_{i=1}^{(n-1)/2} A_i A_{n-i}.$$


§7. Some weight distributions


Theorem 31 is useful when it is easy to find the degree of $f_2(z)$.


Examples. (1) The simplex code of length $n = 2^m - 1$. From (25) any codeword has MS polynomial

$$A(z) = \sum_{i=0}^{m-1} (\beta z)^{2^i}.$$

Hence $zA'(z) = \beta z$ and $f_2(z) = \gcd(A(z), zA'(z)) = z$. The degree of $f_2(z)$ is 1, and the weight of the codeword is

$$2^m - 1 - 2^{m-1} + 1 = 2^{m-1}.$$

(2) The degenerate simplex code $(\theta_s^*)$ of length $n = 2^m - 1$, where $m = 2u \ge 4$ is even, and $s = 2^u + 1$.
The cyclotomic coset $C_s$ is

$$\{\,s = 2^u + 1,\ 2(2^u + 1),\ 2^2(2^u + 1),\ \ldots,\ 2^{u-1}(2^u + 1)\,\}$$

of length u. (Check for n = 15, s = 5, and n = 63, s = 9.) The ideal with idempotent $\theta_s^*$ has dimension u, and its nonzero codewords are $x^j\theta_s^*$, $0 \le j \le 2^u - 2$. The MS polynomial of a typical codeword is

$$A(\beta, z) = \sum_{j \in C_s} (\beta z)^j,$$

of degree $2^{2u-1} + 2^{u-1}$. Thus $zA'(\beta, z) = \beta^s z^s$, and $f_2(z) = z^s$ has degree $2^u + 1$, so the weight of each codeword is

$$2^{2u} - 1 - (2^{2u-1} + 2^{u-1}) + 2^u + 1 = 2^{2u-1} + 2^{u-1}.$$


(3) The code with idempotent $\theta_1^* + \theta_s^*$, with n, m, s as in Example 2. This code has dimension 3u, and a typical codeword is

$$a\,x^i\theta_1^* + b\,x^j\theta_s^*,$$

where $a, b \in GF(2)$ and $0 \le i \le 2^{2u} - 2$, $0 \le j \le 2^u - 2$. If a = 1, b = 0 this codeword has weight $2^{2u-1}$. If a = 0, b = 1 the weight is $2^{2u-1} + 2^{u-1}$. If a = b = 1 we have a cyclic permutation of a codeword of the form

$$x^i\theta_1^* + \theta_s^*.$$

The MS polynomial is

$$A(\beta, z) = \sum_{j \in C_1} (\beta z)^j + \sum_{j \in C_s} z^j,$$
$$A'(\beta, z) = \beta + z^r, \quad \text{where } r = 2^u.$$

We shall find $f_2(z)$. Now $A'(\beta, z) = 0$ for $\beta = z^r$, or

$$z = z^{2^{2u}} = (z^r)^r = \beta^r.$$

Then

$$A(\beta, \beta^r) = T_{2u}(\beta^{r+1}) + \sum_{j \in C_s} \beta^{rj}.$$

Now $\beta^{r+1}$ is in $GF(2^u)$ by Theorem 8 of Ch. 4. Thus

$$T_{2u}(\beta^{r+1}) = 2T_u(\beta^{r+1}) = 0,$$
$$\sum_{j \in C_s} \beta^{rj} = \beta^{r(r+1)} + \beta^{2r(r+1)} + \cdots + \beta^{2^{u-1}r(r+1)} = T_u(\beta^{r+1}).$$

This is either 0 or 1, depending on β. If $A(\beta, \beta^r) = 0$, the degree of $f_2(z)$ is r + 1 and the codeword has weight $2^{2u-1} + 2^{u-1}$. If $A(\beta, \beta^r) = 1$, the degree of $f_2(z)$ is 1, and the codeword has weight $2^{2u-1} - 2^{u-1}$. Thus we have:


Theorem 32. For $m = 2u \ge 4$, and $s = 2^u + 1$, the code with idempotent $\theta_1^* + \theta_s^*$ has three nonzero weights, namely $\tau_1 = 2^{2u-1} - 2^{u-1}$, $\tau_2 = 2^{2u-1}$, $\tau_3 = 2^{2u-1} + 2^{u-1}$.


Theorem 33. The weight distribution of this code is:


Weight                        Number of Codewords
0                             1
$2^{2u-1} - 2^{u-1}$          $2^{u-1}(2^{2u} - 1)$
$2^{2u-1}$                    $2^{2u} - 1$
$2^{2u-1} + 2^{u-1}$          $2^{3u-1} - 2^{2u} + 2^{u-1}$


Proof. Since the code contains $\mathcal{S}_m$, its dual is contained in $\mathcal{H}_m$. Thus $d' \ge 3$ and we may apply Theorem 2 of Ch. 6. Q.E.D.
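
For u = 2 (n = 15) the three weights and their frequencies can be confirmed by brute force from the MS description of the code: every codeword has MS polynomial T_4(βz) + T_2(γz^5) with β in GF(16) and γ in GF(4), and the inversion formula (9) gives a_i = A(α^i) since 1/n = 1 for odd n. The sketch below is an illustration under an assumption made here, namely that GF(16) is built from the primitive polynomial x^4 + x + 1; the helper names are hypothetical.

```python
from collections import Counter

# GF(16) from the primitive polynomial x^4 + x + 1 (an assumed choice).
EXP = [1] * 15
for i in range(1, 15):
    v = EXP[i - 1] << 1
    EXP[i] = v ^ 0b10011 if v & 0b10000 else v
LOG = {v: i for i, v in enumerate(EXP)}

def mul(x, y):
    return 0 if x == 0 or y == 0 else EXP[(LOG[x] + LOG[y]) % 15]

def power(x, e):
    """x^e for e >= 1 (with 0^e = 0)."""
    return 0 if x == 0 else EXP[(LOG[x] * e) % 15]

def trace4(x):      # x + x^2 + x^4 + x^8, an element of GF(2)
    return x ^ power(x, 2) ^ power(x, 4) ^ power(x, 8)

def trace2(x):      # x + x^2 for x in the subfield GF(4), an element of GF(2)
    return x ^ power(x, 2)

GF16 = [0] + EXP
GF4 = [0, 1, EXP[5], EXP[10]]

weights = Counter()
for beta in GF16:
    for gamma in GF4:
        # a_i = A(alpha^i) with A(z) = T_4(beta z) + T_2(gamma z^5)
        a = [trace4(mul(beta, EXP[i])) ^ trace2(mul(gamma, EXP[(5 * i) % 15]))
             for i in range(15)]
        weights[sum(a)] += 1
print(sorted(weights.items()))
```

The counts agree with Theorem 33 for u = 2: weights 6, 8, 10 occur 30, 15 and 18 times.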


Problems. (61) Show that the dual code has minimum distance 3.
(62) Let $\mathcal{C}$ be the code with idempotent $\theta_0 + \theta_1^* + \theta_s^*$, formed by adding the codeword 1. Show that $\mathcal{C}$ has dimension 3u + 1, and $\mathcal{C}^\perp$ has minimum distance 4. Let $\hat{\mathcal{C}}$ be the code obtained from $\mathcal{C}$ by adding an overall parity check. Show that the weight distribution of $\hat{\mathcal{C}}$ is:


Weight                                Number of Codewords
0                                     1
$\tau_1 = 2^{2u-1} - 2^{u-1}$         $A_{\tau_1} = 2^{3u} - 2^{2u}$
$\tau_2 = 2^{2u-1}$                   $A_{\tau_2} = 2(2^{2u} - 1)$
$\tau_3 = 2^{2u-1} + 2^{u-1}$         $A_{\tau_3} = 2^{3u} - 2^{2u}$
$2^{2u}$                              1


Some Examples. The [16, 7, 6] code obtained by extending the cyclic code with idempotent $\theta_0 + \theta_1^* + \theta_5^*$:


i:      0    6    8   10   16
$A_i$:  1   48   30   48    1


The [64, 10, 28] code obtained by extending the cyclic code with idempotent $\theta_0 + \theta_1^* + \theta_9^*$:


i:      0   28   32   36   64
$A_i$:  1  448  126  448    1

Example. (4) The code of length $n = 2^m - 1$, where m = 2u + 1 is odd, which has idempotent $\theta_1^* + \theta_l^*$, $l = 2^u + 1$. Now l and n are relatively prime and

$$C_l = \{\,2^u + 1,\ 2^{u+1} + 2,\ \ldots,\ 2^{2u} + 2^u,\ 2^{u+1} + 1,\ \ldots,\ 2^{2u} + 2^{u-1}\,\}$$

of length m. (Check for n = 31, l = 5, and n = 127, l = 9.) The code has dimension 2m, and consists of the codewords

$$a\,x^i\theta_1^* + b\,x^j\theta_l^*.$$

If either a or b is zero, the codeword has weight $2^{m-1}$. Every codeword with a = b = 1 is a cyclic permutation of a codeword of the form

$$x^i\theta_1^* + \theta_l^*.$$

The MS polynomial, of degree $2^{2u} + 2^u$, is

$$A(\beta, z) = \sum_{j \in C_1} (\beta z)^j + \sum_{j \in C_l} z^j, \qquad \beta \in GF(2^m).$$

Let $r = 2^u$; then $A'(\beta, z) = \beta + z^r + z^{2r} = (\beta^{2r} + z + z^2)^r$. Then $\pi(\beta, z) = \beta^{2r} + z + z^2$ may have 0, 1 or 2 zeros in common with A(z), and the degree of $f_2(z)$ is correspondingly 1, r + 1, or 2r + 1.
Thus the only possible weights are

$$2^m - 1 - (2^{2u} + 2^u - 1) = 2^{2u} - 2^u,$$
$$2^m - 1 - (2^{2u} + 2^u - 1 - 2^u) = 2^{2u},$$
$$2^m - 1 - (2^{2u} + 2^u - 1 - 2^{u+1}) = 2^{2u} + 2^u.$$

This argument does not prove that both extremal weights occur. However, since $\mathcal{C}^\perp$ has minimum distance $\ge 3$, we may use Theorem 2 of Ch. 6 to calculate the number of codewords of each weight:


Theorem 34. The $[n = 2^m - 1,\ 2m,\ 2^{2u} - 2^u]$ cyclic code, where m = 2u + 1, with idempotent $\theta_1^* + \theta_l^*$, $l = 2^u + 1$, has the following weight distribution.


Weight                         Number of Codewords
0                              1
$\tau_1 = 2^{2u} - 2^u$        $A_{\tau_1} = n(2^{2u-1} + 2^{u-1})$,
$\tau_2 = 2^{2u}$              $A_{\tau_2} = n(2^{2u} + 1)$,
$\tau_3 = 2^{2u} + 2^u$        $A_{\tau_3} = n(2^{2u-1} - 2^{u-1})$.


Some examples of Theorem 34 are shown in Fig. 8.7. 


Problem. (63) Prove the following for n = 31, m = 5.

(1) If $T_5(a) = 0$ then $a = \lambda + \lambda^2 = (\lambda + 1) + (\lambda + 1)^2$ for some λ in $GF(2^5)$. Thus $\pi(\beta, z)$, $a = \beta^{2r}$, has two zeros in $GF(2^5)$.

(2) In this case exactly one of these is a zero of A(β, z).

(3) If $T_5(\beta) = 1$, $\pi(\beta, z)$ is irreducible over $GF(2^5)$.


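Corollary 35. $\mathcal{C}^\perp$ has minimum distance 5. Thus the codewords of each weight form a 2-design with parameters

$$\lambda_{\tau_1} = A_{\tau_1}\,\frac{\tau_1(\tau_1 - 1)}{(2^m - 1)(2^m - 2)} = 2^{2u-2}(2^{2u} - 2^u - 1),$$

$$\lambda_{\tau_2} = A_{\tau_2}\,\frac{\tau_2(\tau_2 - 1)}{(2^m - 1)(2^m - 2)} = 2^{2u-1}(2^{2u} + 1),$$

$$\lambda_{\tau_3} = A_{\tau_3}\,\frac{\tau_3(\tau_3 - 1)}{(2^m - 1)(2^m - 2)} = 2^{2u-2}(2^{2u} + 2^u - 1).$$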


Problem. (64) Form the $[2^m, 2m + 1]$ code $\hat{\mathcal{C}}$ by adding a parity check to the code with idempotent $\theta_0 + \theta_1^* + \theta_l^*$. Show that the vectors of each weight form a 3-design and find the parameters.


[31, 10, 12]      i:      0    12        16        20
                  $A_i$:  1    10·31     17·31     6·31
[127, 14, 56]     i:      0    56        64        72
                  $A_i$:  1    36·127    65·127    28·127
[511, 18, 240]    i:      0    240       256       272
                  $A_i$:  1    136·511   257·511   120·511


Fig. 8.7. Examples of Theorem 34.
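
The rows of Fig. 8.7 follow directly from the formulas of Theorem 34. The short sketch below (the function name is hypothetical) evaluates them for u = 2, 3, 4 and checks that the frequencies add up to the number of codewords, 2^{2m}.

```python
def theorem34_distribution(u):
    """Weight -> number of codewords, from the formulas of Theorem 34 (m = 2u + 1)."""
    m = 2 * u + 1
    n = 2**m - 1
    return {
        0: 1,
        2**(2*u) - 2**u: n * (2**(2*u - 1) + 2**(u - 1)),
        2**(2*u):        n * (2**(2*u) + 1),
        2**(2*u) + 2**u: n * (2**(2*u - 1) - 2**(u - 1)),
    }

for u in (2, 3, 4):
    dist = theorem34_distribution(u)
    m = 2 * u + 1
    print(2**m - 1, dist, sum(dist.values()) == 2**(2 * m))
```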


Notes on Chapter 8 


§2. Some of §§2, 3 follow immediately from ring theory. (See Burrow [212], Curtis and Reiner [321].) The group $G = \{1, x, \ldots, x^{n-1}\}$ (of §2 of Ch. 7) has order n which is relatively prime to q, so any representation of G over GF(q) is completely reducible (Maschke's theorem). Therefore the group algebra FG (and so $R_n$) is semisimple. By Wedderburn's first theorem, $R_n$ is a direct sum of minimal ideals, and each minimal ideal contains a unique idempotent generator. For properties of idempotents see Cohen et al. [297], Goppa [540] and MacWilliams [874, 882].


§3. Theorem 6 is not recommended for machine computation of the primitive idempotents. Prange's algorithm ([1077], see also MacWilliams [874]) is superior. Problems 14, 15, 17, 22 are due to Goodman [534], Lempel [813], Goppa [540] and Weiss [1397].


§4. The following papers study the weight distribution of cyclic and related codes: Baumert [83], Baumert and McEliece [86], Berlekamp [113, 114, 118], Buchner [208], Delsarte and Goethals [362] for other two-weight codes, Goethals [488], Hartmann et al. [613], Kasami [727, 729], Kerdock et al. [759], MacWilliams [874], McEliece [940, 941, 945], McEliece and Rumsey [949], Oganesyan et al. [1003-1006], Robillard [1117], Seguin [1174], Solomon [1249], Stein et al. [1275] and Willett [1418].


§5. Problem 33, Theorem 14, Theorem 16, Problem 43, and Problems 38, 46 
are due to MacWilliams [870,871], Prange [1074-1077], Peterson [1038], 
Kasami et al. [738] (see also Delsarte [344]), and Camion [239] respectively. 
Lin [832] proved Theorem 18 in the case t = 2. 


§6. MS polynomials were introduced by Mattson and Solomon [928]; see also Chien and Choy [286], Kerdock et al. [759]; and Dym and McKean [396, Ch. 4] for the discrete Fourier transform. For Newton's identities and other properties of symmetric functions see David and Barton [331, Ch. 17], David, Kendall and Barton [332], Foulkes [447], Kendall and Stuart [756, Ch. 3], Turnbull [1342, §32] and Van der Waerden [1376, vol. 1, p. 81]. Problem 55 is from Chien [283] (see also Delsarte [346]), Problem 56 from Chien and Choy [286], and Theorem 25 is due to Forney [435].

Problem 60 is from Solomon and McEliece [1256]. Other properties of the $\Gamma_i(a)$ will be found there and in McEliece [939, 940]. These properties are useful in finding weight distributions.


§7. For odd m the weight distributions of all codes of length $2^m - 1$ with idempotents of the form $\theta_1^* + \theta_s^*$, $s = 2^i + 1$, have been given in [759]. Berlekamp [118] and Kasami [729] have given very general results on the weight distribution of subcodes of second-order Reed-Muller codes, i.e. of codes with idempotents $\sum_j \theta_j^*$ where j runs through a set of numbers of the form $2^i + 1$ (see Ch. 15).








BCH codes 


§1. Introduction 


BCH codes are of great practical importance for error correction, par- 
ticularly if the expected number of errors is small compared with the length. 
These codes were first introduced in Ch. 3, where double-error-correcting 
BCH codes were constructed as a generalization of Hamming codes. But 
BCH codes are best considered as cyclic codes, and so it was not until §6 of 
Ch. 7 that the general definition of a t-error-correcting BCH code was given. 
Naturally most of the theory of cyclic codes given in Chs. 7 and 8 applies to 
BCH codes. 

For example, BCH codes are easily encoded by either of the methods of §8 
of Ch. 7. Decoding will be dealt with in §6 below. 

We recall from Ch. 7 that a BCH code over GF(q) of length n and designed distance δ is the largest possible cyclic code having zeros

$$\alpha^b,\ \alpha^{b+1},\ \ldots,\ \alpha^{b+\delta-2},$$

where $\alpha \in GF(q^m)$ is a primitive nth root of unity, b is a nonnegative integer, and m is the multiplicative order of q modulo n (see §5 of Ch. 7).

Important special cases are b = 1 (called narrow-sense BCH codes), or $n = q^m - 1$ (called primitive BCH codes). A BCH code is assumed to be narrow-sense and primitive unless stated otherwise. BCH codes with n = q - 1 (i.e. m = 1, $\alpha \in GF(q)$) are another important subclass. These are known as Reed-Solomon codes, and the next chapter is devoted to their special properties.

The following bounds on the dimension k and minimum distance d of any BCH code were obtained in §6 of Ch. 7.

Theorem 1. (a) The BCH code over GF(q) of length n and designed distance δ has dimension $k \ge n - m(\delta - 1)$ and minimum distance $d \ge \delta$.

(b) The binary BCH code of length n and designed distance δ = 2t + 1 has dimension $k \ge n - mt$ and minimum distance $d \ge \delta$.

Naturally one wishes to know how close these bounds are to the true values of k and d, and we shall study this question in §§2 and 3 of this chapter. The answer is roughly that they are quite close. However the precise determination of d is still unsolved in general.

§4 contains a table of binary BCH codes of length $\le 255$. In this range (and in fact for lengths up to several thousand) BCH codes are among the best codes we know (compare Appendix A). But as shown in §5 their performance deteriorates if the rate R = k/n is held fixed and the length approaches infinity.

There is a large literature on decoding BCH codes, and very efficient algorithms exist. §6 contains an introduction to the decoding, giving a simple description of the main steps.

The final step is to find the zeros of the error locator polynomial. If this should be a quadratic equation (over a field of characteristic 2) the zeros can be easily found, as we see in §7. The results of §7 are also used in §8 to show that binary double-error-correcting BCH codes are quasi-perfect.

In §9 a deep result from number theory is used to show that if $\mathcal{C}$ is a BCH code of designed distance not exceeding about $n^{1/2}$, then all the weights of $\mathcal{C}^\perp$ are close to $\tfrac{1}{2}n$.

The weight distributions of many codes can be observed to have the 
approximate shape of a normal probability density function. §10 gives a 
partial explanation of this phenomenon. 

In the final section of the chapter we describe an interesting family of 
non-BCH triple-error-correcting codes. 


Notation. $\mathcal{C}$ will usually denote an [n, k, d] (narrow-sense, primitive) BCH code over GF(q) of designed distance δ. As usual $\alpha \in GF(q^m)$ is a primitive nth root of unity, where m is the multiplicative order of q mod n (see §5 of Ch. 7).


Let $a = (a_0, \ldots, a_{n-1})$ be a vector over GF(q) of weight w, with

$$a(x) = \sum_{i=0}^{n-1} a_i x^i.$$

If $a_{i_1}, \ldots, a_{i_w}$ are the non-zero components of a, we define the locators of a to be $X_r = \alpha^{i_r}$, r = 1, ..., w, the locator polynomial to be

$$\sigma(z) = \prod_{r=1}^{w} (1 - X_r z) = \sum_{i=0}^{w} \sigma_i z^i, \tag{1}$$

and let $Y_r = a_{i_r}$, r = 1, ..., w. Also the power sums are defined by

$$A_l = a(\alpha^l) = \sum_{r=1}^{w} Y_r X_r^l, \qquad l = 0, 1, \ldots. \tag{2}$$

(See §6 of Ch. 8.) In particular, if a belongs to $\mathcal{C}$, then

$$A_l = 0, \quad \text{for } 1 \le l \le \delta - 1.$$


§2. The true minimum distance of a BCH code


The first three theorems say that under certain conditions the true minimum distance of a BCH code is equal to the designed distance. The last theorem gives an upper bound on the true minimum distance for any BCH code. The first two results are easy:


Theorem 2. (Farr.) The binary BCH code of length $n = 2^m - 1$ and designed distance δ = 2t + 1 has minimum distance 2t + 1 if (a)

$$\sum_{i=0}^{t+1} \binom{2^m - 1}{i} > 2^{mt}, \tag{3}$$

or if (b)

$$m > \log_2 (t + 1)! + 1 \sim t \log_2 (t/e) \quad \text{as } t \to \infty.$$


Proof. (a) By Corollary 17 of Ch. 8 the minimum distance d is odd. Suppose $d \ge 2t + 3$. The dimension of the code is $\ge n - mt$, so from the sphere-packing bound

$$2^{n - mt} \sum_{i=0}^{t+1} \binom{n}{i} \le 2^n.$$

But this contradicts (3). That (b) implies (a) is a routine calculation which we omit. Q.E.D.


Example. Let m = 5, t = 1, 2, 3. Then it is readily checked that

$$\sum_{i=0}^{t+1} \binom{31}{i} > 2^{5t}.$$

Thus as shown in Fig. 9.1, the codes of length 31 with δ = 3, 5, 7 actually have d = 3, 5, 7.
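
The check in this example is a one-line computation. A minimal sketch (the function name is hypothetical; `math.comb` needs Python 3.8 or later) that tests condition (3) of Theorem 2:

```python
from math import comb

def farr_condition(m, t):
    """Condition (3): sum_{i=0}^{t+1} C(2^m - 1, i) > 2^(m t)."""
    n = 2**m - 1
    return sum(comb(n, i) for i in range(t + 2)) > 2**(m * t)

print([farr_condition(5, t) for t in (1, 2, 3, 4)])  # [True, True, True, False]
```

For t = 4 the condition fails, so Theorem 2 gives no information about the code of length 31 and designed distance 9.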


Theorem 3. (Peterson.) If n = ab then the (not necessarily primitive) binary BCH code $\mathcal{C}$ of length n and designed distance a has minimum distance exactly a.


Proof. Let α be a primitive nth root of unity, so that $\alpha^{ib} \ne 1$ for i < a. Since

$$x^n - 1 = (x^b - 1)(1 + x^b + x^{2b} + \cdots + x^{(a-1)b}),$$

the elements $\alpha, \alpha^2, \ldots, \alpha^{a-1}$ are not zeros of $x^b - 1$ and are zeros of $1 + x^b + \cdots + x^{(a-1)b}$. Therefore the latter is a codeword of weight a in $\mathcal{C}$. Q.E.D.


Example. The code of length 255 and designed distance 51 has minimum distance exactly 51.


The following Lemma characterizes locator polynomials of vectors of 0's and 1's (over any finite field).


Lemma 4. Let

$$\sigma(z) = \sum_{i=0}^{w} \sigma_i z^i$$

be any polynomial over $GF(q^m)$. Then σ(z) is the locator polynomial of a codeword a of 0's and 1's in a BCH code over GF(q) of designed distance δ iff conditions (i) and (ii) hold.

(i) The zeros of σ(z) are distinct nth roots of unity.

(ii) $\sigma_i$ is zero for all i in the range $1 \le i \le \delta - 1$ which are not divisible by p, where q is a power of the prime p.


Proof. If σ(z) is the locator polynomial of such a vector a, then $A_l = 0$ for $1 \le l \le \delta - 1$. The result then follows by Problem (53) of Ch. 8. On the other hand suppose (i), (ii) hold, and let $X_1, \ldots, X_w$ be the reciprocals of the zeros of σ(z). Then the vector a with 1's in these locations satisfies $A_l = 0$ for $1 \le l \le \delta - 1$ (again by Problem (53)) and so belongs to the BCH code.
Q.E.D.


We use this lemma to prove 


Theorem 5. A BCH code of length $n = q^m - 1$ and designed distance $\delta = q^h - 1$ over GF(q) has true minimum distance δ.


Proof. We shall find a codeword of weight δ. Let U be an h-dimensional subspace of $GF(q^m)$. By Lemma 21 of Ch. 4

$$L(z) = \prod_{\beta \in U} (z - \beta) = z^{q^h} + l_{h-1} z^{q^{h-1}} + \cdots + l_0 z \tag{4}$$

is a linearized polynomial over $GF(q^m)$. Consider the vector a which has as locators all nonzero β in U, and has 1's in these locations. The locator polynomial of a is therefore

$$\sigma(z) = \prod_{\beta \in U,\ \beta \ne 0} (1 - \beta z) = z^{q^h} L(z^{-1}) = \sum_{i=0}^{q^h - 1} \sigma_i z^i,$$

where $\sigma_i = 0$ if i is not equal to $q^h - q^j$, for j = 0, 1, ..., h. By Lemma 4 a is a vector of weight $q^h - 1$ in the BCH code of designed distance $q^h - 1$.
Q.E.D.


Example. Let q = 2, m = 3, h = 2 - i.e. consider the BCH code of length 7 and designed distance 3. This is the [7, 4, 3] Hamming code. The proof of the theorem shows that the code contains the incidence vectors of all lines in the projective plane over GF(2) (see Fig. 2.12 and Equation (24) of Ch. 2), and hence the true minimum distance is 3.


Theorem 6. The true minimum distance d of a primitive BCH code $\mathcal{C}$ of designed distance δ over GF(q) is at most qδ - 1.


Proof. Define $d_0$ by $q^{h-1} \le \delta \le d_0 = q^h - 1$. Since BCH codes are nested, $\mathcal{C}$ contains the code of designed distance $d_0$, which by the previous theorem has true minimum distance $d_0$. Therefore

$$d \le d_0 \le q\delta - 1. \qquad \text{Q.E.D.}$$


Although a number of results similar to Theorems 2, 3, 5 are known (see notes at the end of the chapter), the following problem remains unsolved:


Research Problem (9.1). Find necessary and sufficient conditions on n and the designed distance δ for the minimum distance d to equal δ.


The exact determination of d, in the case d > δ, will be even harder.
§3. The number of information symbols in BCH codes


Suppose $\mathcal{C}$ is a BCH code over GF(q) of length $n = q^m - 1$ and designed distance δ. In Theorem 1 we saw that the number of information symbols, which we denote by I(n, δ), is at least n - m(δ - 1). But the precise number depends on the degree of the generator polynomial g(x), for

$$I(n, \delta) = n - \deg g(x).$$


The degree of g(x). By definition g(x) is the lowest degree polynomial having $\alpha, \alpha^2, \ldots, \alpha^{\delta-1}$ as zeros, where α is a primitive element of $GF(q^m)$. If $\alpha^i$ is a zero of g(x) so are all the conjugates of $\alpha^i$, namely $\alpha^{iq}, \alpha^{iq^2}, \ldots$. The number of different conjugates of $\alpha^i$ is the smallest integer $m_i$ such that $iq^{m_i} \equiv i \pmod{q^m - 1}$. In the notation of §3 of Ch. 4, $m_i$ is the size of the cyclotomic coset containing i.

There is an easy way to find $m_i$. Write i to the base q as

$$i = i_{m-1}q^{m-1} + \cdots + i_1 q + i_0,$$

where $0 \le i_\nu \le q - 1$. Then

$$qi \equiv i_{m-2}q^{m-1} + \cdots + i_1 q^2 + i_0 q + i_{m-1} \pmod{q^m - 1},$$

i.e. the effect of multiplying by q is a cyclic shift of the m-tuple $(i_{m-1}, \ldots, i_1, i_0)$.

For example, suppose q=2, m=4, q”—1=15. Then i=1 gives the 
4-tuple (0001), which has period 4 under cyclic shifts: (0001), (0010), (0100), 
(1000), (0001),...; hence m,=4. Equivalently the cyclotomic coset C,= 
{1,2,4,8} has size 4. On the other hand i = 5 = (0101) has period 2: (0101), 
(1010), (0101),...; hence m; = 2. Thus C; = {5, 10}. 
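
The cyclic-shift rule translates directly into a few lines of code. The sketch below (the function name is hypothetical) lists the cyclotomic coset of i modulo q^m - 1 by repeatedly shifting the m-digit base-q expansion of i; the length of the returned list is m_i.

```python
def coset_by_shifts(i, q, m):
    """Cyclotomic coset of i modulo q^m - 1 (for 1 <= i < q^m - 1), obtained by
    cyclically shifting the m-digit base-q expansion of i."""
    digits = [(i // q**k) % q for k in range(m)]     # digits[k] = i_k
    coset = []
    while True:
        value = sum(d * q**k for k, d in enumerate(digits))
        if value in coset:
            return coset
        coset.append(value)
        # Multiplying by q moves digit i_k to position k + 1 (cyclically).
        digits = [digits[-1]] + digits[:-1]

print(coset_by_shifts(1, 2, 4))   # [1, 2, 4, 8], so m_1 = 4
print(coset_by_shifts(5, 2, 4))   # [5, 10],      so m_5 = 2
```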

If i is sufficiently small compared to n, then m; = m. More precisely, we 
have 


Theorem 7. For q —2 and n —2" — |, the cyclotomic cosets Ci, Cs, Cs,..., C, 
are distinct and each contain m elements, provided 
i «27 4] (5) 


(where [x] denotes the smallest integer =x). Thus m, = m if i satisfies (5). 


Proof. (i) We first prove that these cosets are distinct. Consider the coset $C_i$.
The binary expansion of i written as an m-tuple is

$$ \underbrace{0 0 \cdots 0}_{\lfloor m/2 \rfloor}\, x x \cdots x 1, \qquad x = 0 \text{ or } 1. \qquad (6) $$

Could there be a different cyclic shift of (6) which also begins with $\lfloor m/2 \rfloor$ zeros
and ends with a 1? Obviously not. Therefore there cannot be an odd
$j < 2^{\lceil m/2 \rceil} + 1$, $j \ne i$, with $C_i = C_j$.

(ii) The number of elements in $C_i$ is the number of distinct cyclic shifts of
the binary expansion of i written as an m-tuple. It is easy to see that all m
cyclic shifts are distinct if $i < 2^{\lceil m/2 \rceil} + 1$. Q.E.D.


Example. n = 63, m = 6. The first few cyclotomic cosets are:

    coset                                 binary 6-tuple
    C_1 = {1, 2, 4, 8, 16, 32}            000001
    C_3 = {3, 6, 12, 24, 48, 33}          000011
    C_5 = {5, 10, 20, 40, 17, 34}         000101
    C_7 = {7, 14, 28, 56, 49, 35}         000111
    C_9 = {9, 18, 36}                     001001

Indeed, $C_1, \ldots, C_7$ are distinct and have size 6, as predicted by the theorem.


This theorem has an immediate corollary. 


Corollary 8. Let $\mathcal{C}$ be a binary BCH code of length $n = 2^m - 1$ and designed
distance $\delta = 2t + 1$, where

$$ 2t - 1 < 2^{\lceil m/2 \rceil} + 1. $$

Then dim $\mathcal{C} = 2^m - 1 - mt$.


This corollary applies when the designed distance $\delta$ is small. For arbitrary
$\delta$ we have


Theorem 9. The degree of g(x) is the number of integers i in the range
$1 \le i \le q^m - 1$ such that some cyclic shift of the q-ary expansion of i (written
as an m-tuple) is $\le \delta - 1$.


Proof. The condition simply states that $\alpha^i$ is a root of g(x) iff $\alpha^i$ has a conjugate
$\alpha^{iq^l}$ with $1 \le iq^l \le \delta - 1$ (the exponent $iq^l$ being reduced mod $q^m - 1$). Q.E.D.


For example, consider the binary BCH code of length 15 and designed
distance $\delta = 5$. g(x) has roots $\alpha, \alpha^2, \alpha^4, \alpha^8$, and $\alpha^3, \alpha^6, \alpha^{12}, \alpha^9$. The exponents i
are

    0001, 0010, 0100, 1000,
    0011, 0110, 1100, 1001,

and deg g(x) = 8, verifying the theorem. The number of information symbols
is I(15, 5) = 15 - 8 = 7.

But if $\delta = 7$, g(x) has the additional zeros $\alpha^5, \alpha^{10}$, with i = (0101), (1010),
and deg g(x) = 10, I(15, 7) = 15 - 10 = 5.
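Theorem 9 translates directly into a counting procedure. The sketch below is an illustration only (the names `conjugate_exponents` and `bch_parameters` are invented here); it uses the fact that multiplying i by q mod $q^m - 1$ is the same as cyclically shifting its m-tuple, and counts the exponents i that have a conjugate in the range 1, ..., $\delta - 1$.

```python
def conjugate_exponents(i, q, m):
    """All exponents i, iq, iq^2, ... mod q^m - 1 (the cyclic shifts of i)."""
    n = q ** m - 1
    out, j = set(), i
    for _ in range(m):
        out.add(j)
        j = (j * q) % n
    return out


def bch_parameters(q, m, delta):
    """Return (deg g(x), I(n, delta)) for the narrow-sense BCH code of length q^m - 1.

    Only i = 1, ..., n - 1 is scanned: i = n is the all-(q-1) tuple, whose only
    cyclic shift is n itself, so it is never counted when delta <= n.
    """
    n = q ** m - 1
    degree = sum(
        1 for i in range(1, n)
        if any(1 <= j <= delta - 1 for j in conjugate_exponents(i, q, m))
    )
    return degree, n - degree


print(bch_parameters(2, 4, 5))   # length 15, designed distance 5
print(bch_parameters(2, 4, 7))   # length 15, designed distance 7
```

The two calls reproduce the values deg g(x) = 8, I(15, 5) = 7 and deg g(x) = 10, I(15, 7) = 5 worked out above.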


Problems. (1) How many information symbols are there in the binary BCH
codes of length 63 and with $\delta$ = 5, 7, 11, 21, and 27?
(2) Repeat for the BCH codes over GF(3) of length 80 and $\delta$ = 4, 7, and 11.


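Designed distance $\delta = q^\lambda$. In the rest of this section we shall find an upper
bound to $I(n, \delta)$ in the special case where $n = q^m - 1$ and $\delta = q^\lambda$. Note that

$$ \frac{\delta}{n} = \frac{q^\lambda}{q^m - 1} \approx q^{-r}, \quad \text{as } \lambda \to \infty, $$

where $r = m - \lambda$.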

In the m-tuple $i_0, i_1, \ldots, i_{m-1}$ a maximal string of consecutive zeros is called
a run. We distinguish between straight runs and circular runs. For example
0100100 contains two straight runs of length 2 and one of length 1, and one
circular run of length 3 and one of length 2.

Theorem 9 now implies: 


Lemma 10. The degree of g(x) is the number of integers i in the range
$1 \le i \le q^m - 1$ such that the q-ary expansion of i (written as an m-tuple)
contains a circular run of length at least r.


Proof. $i = i_{m-1} q^{m-1} + \cdots + i_1 q + i_0 \le q^\lambda - 1$ iff $i_\lambda = i_{\lambda+1} = \cdots = i_{m-1} = 0$, i.e. iff
$(i_{m-1}, \ldots, i_1, i_0)$ starts with a run of length at least $m - \lambda = r$. Q.E.D.


From now on let r be fixed. Let s(m) denote the number of m-tuples of
elements from {0, 1, ..., q - 1} which contain a straight run of length at least
r; and let c(m) be the number of m-tuples which contain a circular run of
length at least r but no such straight run. Then from Lemma 10,

$$ \deg g(x) = s(m) + c(m) - 1 \qquad (7) $$

(the last term because i = 0 is excluded), and

$$ I(q^m - 1, q^\lambda) = q^m - s(m) - c(m). \qquad (8) $$

Thus the problem is reduced to the purely combinatorial one of determining
s(m) and c(m). For our present purpose we can drop c(m):

$$ I(q^m - 1, q^\lambda) \le q^m - s(m) \qquad (9) $$

(but see Problem 3 at the end of this section). It is easy to find a recurrence
for s(m):


Lemma 11.

$$ s(k) = 0 \text{ for } 0 \le k < r, \qquad s(r) = 1, \qquad (10) $$

$$ s(m) = q\, s(m-1) + (q-1)\bigl(q^{m-r-1} - s(m-r-1)\bigr), \qquad m > r. \qquad (11) $$

Proof. (10) is immediate. To prove (11), observe that an m-tuple counted in
s(m) either contains a run of length r in the last m - 1 places (giving the first
term), or else must have the form

$$ \underbrace{0\,0 \cdots 0}_{r}\; i_{m-r-1} \cdots i_1 i_0, $$

where $i_{m-r-1} \ne 0$ and there is no run of length r in the last m - r - 1 places.
Q.E.D.
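The recurrence of Lemma 11 can be checked against a direct count for small parameters. This is only a sanity-check sketch; the helper names `s_recurrence` and `s_brute` are hypothetical, and r is assumed to be at least 1.

```python
from itertools import product


def s_recurrence(q, r, m_max):
    """s(0), ..., s(m_max) computed from (10) and (11)."""
    s = []
    for m in range(m_max + 1):
        if m < r:
            s.append(0)
        elif m == r:
            s.append(1)
        else:
            s.append(q * s[m - 1] + (q - 1) * (q ** (m - r - 1) - s[m - r - 1]))
    return s


def s_brute(q, r, m):
    """Count the m-tuples over {0, ..., q-1} containing r consecutive zeros."""
    zeros = (0,) * r
    return sum(
        1 for t in product(range(q), repeat=m)
        if any(t[k:k + r] == zeros for k in range(m - r + 1))
    )


for q, r in ((2, 2), (2, 3), (3, 2)):
    rec = s_recurrence(q, r, 8)
    print(q, r, rec, all(rec[m] == s_brute(q, r, m) for m in range(9)))
```

Exhaustive enumeration is only feasible for small q and m, which is exactly why the recurrence is useful.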
If we let $e(m) = q^m - s(m)$ then

$$ I(q^m - 1, q^\lambda) \le e(m) \qquad (12) $$

and, from Lemma 11,

$$ e(k) = q^k \text{ for } 0 \le k < r, \qquad e(r) = q^r - 1, \qquad (13) $$

$$ e(m) = q\, e(m-1) - (q-1)\, e(m-r-1), \qquad m > r. \qquad (14) $$

The most general solution to the recurrence (14) is

$$ e(m) = \sum_{j=0}^{r} a_j \rho_j^m, \qquad (15) $$

where $a_0, \ldots, a_r$ are determined by the initial conditions (13) and $\rho_0, \ldots, \rho_r$
are the real or complex roots of

$$ \rho^{r+1} = q \rho^r - (q - 1). \qquad (16) $$

Now all the roots of (16) satisfy $|\rho_j| < q$. This is clearly true for real roots. For
a complex root $\rho = b e^{i\theta}$, $\theta \ne 0$, it is even true that b < 1. For suppose $b \ge 1$.
From (16), using the triangle inequality,

$$ b^{r+1} + q - 1 > q b^r. \qquad (17) $$

On the other hand (16) also implies (dividing out the root $\rho = 1$)

$$ \rho^r = (q-1)(\rho^{r-1} + \cdots + 1), $$
$$ b^r \le (q-1)(b^{r-1} + \cdots + 1), $$
$$ b^{r+1} \le q b^r - (q - 1), $$

which contradicts (17).

We have therefore proved:


Theorem 12. (Mann.) The number of information symbols in the narrow-sense
BCH code over GF(q) of length $q^m - 1$ and designed distance $q^\lambda$ is

$$ I(q^m - 1, q^\lambda) \le \sum_{j=0}^{r} a_j \rho_j^m, $$

where $r = m - \lambda$, and $a_0, \ldots, a_r$, $\rho_0, \ldots, \rho_r$ depend on r (but not on m), and
satisfy $|\rho_j| < q$.


We shall use this result in §5.
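To get a feeling for how tight the bound (12) is, one can compare e(m), computed from the recurrence (13), (14), with the exact number of information symbols obtained by counting roots as in Theorem 9. The sketch below is illustrative only; `e_values` and `exact_information_symbols` are names chosen for this example.

```python
def e_values(q, r, m_max):
    """e(0), ..., e(m_max) from the initial conditions (13) and recurrence (14)."""
    e = []
    for m in range(m_max + 1):
        if m < r:
            e.append(q ** m)
        elif m == r:
            e.append(q ** r - 1)
        else:
            e.append(q * e[m - 1] - (q - 1) * e[m - r - 1])
    return e


def exact_information_symbols(q, m, lam):
    """I(q^m - 1, q^lam), counting the roots of g(x) as in Theorem 9."""
    n, delta = q ** m - 1, q ** lam
    degree = 0
    for i in range(1, n):
        j, is_root = i, False
        for _ in range(m):          # run through the cyclic shifts of i
            if 1 <= j <= delta - 1:
                is_root = True
            j = (j * q) % n
        degree += is_root
    return n - degree


q, r = 2, 2
e = e_values(q, r, 9)
for m in range(4, 10):
    exact = exact_information_symbols(q, m, m - r)
    print(m, exact, e[m], e[m] / q ** m)
```

In this small experiment (q = 2, r = 2) the last column, the bound divided by $q^m$, shrinks as m grows, which is the behaviour exploited in §5.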


Problem. (3) (Mann.) Show that $I(q^m - 1, q^\lambda)$ can be found exactly, as
follows:
(i) If $m \ge 2r$ then

$$ c(m) = (q-1)^2 \sum_{k=0}^{r-2} (r-k-1)\bigl\{q^{m-r-k-2} - s(m-r-k-2)\bigr\}. $$

(ii) Hence (12) may be replaced by the equality

$$ I(q^m - 1, q^\lambda) = e(m) - (q-1)^2 \sum_{k=0}^{r-2} (r-k-1)\, e(m-r-k-2), $$

for $m \ge 2r$.
(iii) The roots of (16) are $\rho_0 = 1$, $\rho_1$ real with $qr/(r+1) \le \rho_1 \le q$, and $|\rho_i| < 1$
for $2 \le i \le r$.
(iv) In (15),

$$ a_i = \frac{\rho_i(\rho_i^r - 1)}{\rho_i^{r+1} - (q-1)r}. $$

(v) Finally, after some manipulation,

$$ I(q^m - 1, q^\lambda) = \sum_{i=1}^{r} \rho_i^m = \text{nearest integer to } \rho_1^m, \text{ for large } m. $$


§4. A table of BCH codes


Even though the performance of BCH codes deteriorates for large n if the
rate is held fixed (as we shall see in §5), for modest lengths, up to a few
thousand, they are among the best codes known. Figure 9.1 gives a table of
primitive binary BCH codes of length $\le 255$. For each code we give the length
n, dimension k, and minimum distance d. If d is marked with an asterisk, it is
only a lower bound on the true minimum distance.

    n     (k, d)
    7     (4, 3)
    15    (11, 3) (7, 5) (5, 7)
    31    (26, 3) (21, 5) (16, 7) (11, 11) (6, 15)
    63    (57, 3) (51, 5) (45, 7) (39, 9) (36, 11) (30, 13) (24, 15)
          (18, 21) (16, 23) (10, 27) (7, 31)
    127   (120, 3) (113, 5) (106, 7) (99, 9) (92, 11) (85, 13) (78, 15)
          (71, 19) (64, 21) (57, 23) (50, 27) (43, 31#) (36, 31)
          (29, 43*) (22, 47) (15, 55) (8, 63)
    255   (247, 3) (239, 5) (231, 7) (223, 9) (215, 11) (207, 13) (199, 15)
          (191, 17) (187, 19) (179, 21*) (171, 23) (163, 25*) (155, 27)
          (147, 29*) (139, 31) (131, 37*) (123, 39*) (115, 43*) (107, 45*)
          (99, 47) (91, 51) (87, 53*) (79, 55) (71, 59*) (63, 61*) (55, 63)
          (47, 85) (45, 87*) (37, 91*) (29, 95) (21, 111) (13, 119) (9, 127)

    #  d = δ + 2
    *  lower bound on d

Fig. 9.1. A table of BCH codes.

For only one code in this Fig., the [127, 43, 31] code of designed distance
$\delta = 29$, is it known that $d > \delta$. However (see Notes), there are infinitely many
such codes for larger values of m. It is easy to find examples of nonprimitive
BCH codes with $d > \delta$, as we saw in Ch. 7.


Interesting nonprimitive BCH codes. Sometimes nonprimitive BCH codes 
(when shortened) contain more information symbols for the same number of 
check symbols than do primitive BCH codes. 

For example, let n = 33. The first two cyclotomic cosets are

    C_1 = {1, 2, 4, 8, 16, -1, -2, -4, -8, -16},
    C_3 = {3, 6, 12, 24, 15, -3, -6, -12, -24, -15}.

In general, if $n = 2^m + 1$, then $|C_1| = 2m$, so $x^n + 1$ has zeros in GF($2^{2m}$). If $\beta$ is
a zero of a minimal polynomial, so is $\beta^{-1}$. Hence any cyclic code of length
$2^m + 1$ is reversible (§6 of Ch. 7).

Let $\mathcal{C}$ be the (wide-sense, nonprimitive!) BCH code with generator
polynomial $M^{(0)}(x) M^{(1)}(x) \cdots M^{(u)}(x)$. Then g(x) has zeros $\alpha^i$ for i = 0,
$\pm 1, \ldots, \pm(u+1)$, and so by the BCH bound has minimum distance $\ge 2u + 4$.

For u = 1, $\mathcal{C}$ is a $[2^m + 1, 2^m - 2m, 6]$ code, which has one more information
symbol and the same number of check symbols as the $[2^m, 2^m - 1 - 2m, 6]$
extended primitive BCH code. (See §7.3 of Ch. 18.)


Problem. (4) Let T be the permutation (0, 1, ..., n - 1), n odd, and let
R = (0, n - 1)(1, n - 2) ... ((n - 3)/2, (n + 1)/2)((n - 1)/2) be the reversing
permutation. Of course T generates a cyclic group $G_1 = \{T^i : 0 \le i < n\}$. (i)
Show that T and R generate a group $G_2 = \{T^i, RT^i : 0 \le i < n\}$ of order 2n.
This is called the dihedral group, and is the group of rotations and reflections
of a regular polygon with n sides. Show that

$$ T^n = R^2 = (TR)^2 = I. \qquad (18) $$

(ii) Let G be a group containing elements T and R such that (18) holds, and
$T^i \ne I$ for $1 \le i < n$, $R \ne I$, $TR \ne I$. Show that the subgroup of G generated by
T and R is isomorphic to the dihedral group of order 2n. (iii) A reversible
cyclic code $\mathcal{C}$ of length n is invariant under the group $G_3$ generated by
$\sigma_2 : i \to 2i$ (mod n), T and R. Show that (a) if m, the multiplicative order of
2 mod n, is even and $n \mid 2^{m/2} + 1$, then $R = \sigma_2^{m/2} T^{-1}$, and $G_3$ has order mn and is
the group given in Problem 40 of Ch. 8; but (b) otherwise
$G_3 = \{\sigma_2^i T^j, \sigma_2^i R T^j : 0 \le i < m, 0 \le j < n\}$ has order 2mn.


§5. Long BCH codes are bad


The Gilbert-Varshamov bound (Theorem 12 of Ch. 1, Theorem 30 of Ch.
17) states that if R is fixed, $0 \le R \le 1$, then there exist binary [n, k, d] codes
with $k/n \ge R$ and $d/n \ge H_2^{-1}(1 - R)$, where $H_2^{-1}(x)$ is the inverse of the entropy
function (see §11 of Ch. 10). However except in the case R = 0 or R = 1, no
family of codes yet constructed meets this bound.


Definition. A family of codes over GF(q), for q fixed, is said to be good if
it contains an infinite sequence of codes $\mathcal{C}_1, \mathcal{C}_2, \ldots$, where $\mathcal{C}_i$ is an $[n_i, k_i, d_i]$
code, such that both the rate $R_i = k_i/n_i$ and $d_i/n_i$ approach a nonzero limit as
$i \to \infty$.


Unfortunately, as we shall now see, primitive BCH codes do not have this 
property —asymptotically they are bad! 


Theorem 13. There does not exist an infinite sequence of primitive BCH codes
over GF(q) with both d/n and k/n bounded away from zero.


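Proof. Suppose $\mathcal{C}_1, \mathcal{C}_2, \ldots$ is a sequence of such codes, where $\mathcal{C}_i$ is an
$[n_i = q^{m_i} - 1, k_i, d_i]$ code, and $k_i/n_i \ge B_1 > 0$, $d_i/n_i \ge B_2 > 0$. Let $\mathcal{C}_i'$ be the BCH
code of length $n_i$ and designed distance $\delta_i = \lfloor (d_i + 1)/q \rfloor$. By Theorem 6, the
true minimum distance of $\mathcal{C}_i'$ is $\le d_i$, hence $\mathcal{C}_i' \supseteq \mathcal{C}_i$.

Choose $\lambda_i$ so that $q^{\lambda_i + 1} > \delta_i \ge q^{\lambda_i}$, and let $\mathcal{C}_i''$ be the BCH code of length $n_i$
and designed distance $q^{\lambda_i}$. Clearly $\mathcal{C}_i'' \supseteq \mathcal{C}_i' \supseteq \mathcal{C}_i$, so

$$ I(q^{m_i} - 1, q^{\lambda_i}) \ge k_i. \qquad (19) $$

Now

$$ \frac{q^{\lambda_i}}{n_i} > \frac{\delta_i}{q n_i} > \frac{d_i + 1 - q}{q^2 n_i} \ge \frac{B_2}{q^2} - \frac{q - 1}{q^2 n_i}. $$

Let $r_i = m_i - \lambda_i$. Since

$$ \frac{q^{\lambda_i}}{n_i} = q^{-r_i} \frac{q^{m_i}}{q^{m_i} - 1} \le 2 q^{-r_i}, $$

$r_i$ cannot increase indefinitely, say $r_i \le A_1$. From Theorem 12,

$$ I(q^{m_i} - 1, q^{\lambda_i}) \le \sum_{j=0}^{r_i} a_j^{(i)} \bigl(\rho_j^{(i)}\bigr)^{m_i}. $$

Since $r_i$ is an integer between 0 and $A_1$, there are only finitely many different
$a_j^{(i)}$'s. Let $A_2 = \max |a_j^{(i)}|$. Similarly let $A_3 = \max |\rho_j^{(i)}| < q$. Then

$$ I(q^{m_i} - 1, q^{\lambda_i}) \le (A_1 + 1) A_2 A_3^{m_i}, $$

therefore

$$ \frac{I(q^{m_i} - 1, q^{\lambda_i})}{q^{m_i} - 1} \to 0 \quad \text{as } i \to \infty. $$

But from (19),

$$ \frac{I(q^{m_i} - 1, q^{\lambda_i})}{q^{m_i} - 1} \ge \frac{k_i}{n_i} \ge B_1 > 0, $$

a contradiction. Q.E.D.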


Even though long BCH codes are bad, what about other cyclic codes? The 
answer to this important question remains unknown. 


Research Problem (9.2). Are cyclic codes over GF(q) good, for q fixed? 


§6. Decoding BCH codes


A good deal of work has been done on decoding BCH codes, and efficient 
algorithms now exist. This section contains a simple description of the main 
steps. Much more information will be found in the extensive list of references 
given at the end of the chapter. 

To begin with let $\mathcal{C}$ be an [n, k, d] binary BCH code, of odd designed
distance $\delta$. Suppose the codeword $c = c_0 c_1 \cdots c_{n-1}$ is transmitted and the
distorted vector y = c + e is received, where $e = e_0 e_1 \cdots e_{n-1}$ is the error
vector (Fig. 9.2). The decoding can be divided into 3 stages. (I) Calculation of
the syndrome. (II) Finding the error locator polynomial $\sigma(z)$. (III) Finding the
roots of $\sigma(z)$. We shall describe each stage in turn.


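Stage (I). Calculation of syndrome. The parity check matrix can be taken to be

$$ H = \begin{pmatrix} 1 & \alpha & \alpha^2 & \cdots & \alpha^{n-1} \\ 1 & \alpha^3 & \alpha^6 & \cdots & \alpha^{3(n-1)} \\ \vdots & & & & \vdots \\ 1 & \alpha^{\delta-2} & \alpha^{2(\delta-2)} & \cdots & \alpha^{(\delta-2)(n-1)} \end{pmatrix}. $$

Fig. 9.2. The codeword $c = c_0 \cdots c_{n-1}$ enters the channel, the error vector
$e = e_0 \cdots e_{n-1}$ is added, and the received vector is $y = c + e = y_0 \cdots y_{n-1}$.

As usual let $c(x) = \sum c_i x^i$, $e(x) = \sum e_i x^i$, $y(x) = \sum y_i x^i$. The syndrome of y is (§4
of Ch. 1)

$$ S = Hy^T = \begin{pmatrix} \sum y_i \alpha^i \\ \sum y_i \alpha^{3i} \\ \vdots \\ \sum y_i \alpha^{(\delta-2)i} \end{pmatrix} = \begin{pmatrix} y(\alpha) \\ y(\alpha^3) \\ \vdots \\ y(\alpha^{\delta-2}) \end{pmatrix} = \begin{pmatrix} A_1 \\ A_3 \\ \vdots \\ A_{\delta-2} \end{pmatrix}, \qquad (20) $$

where $A_l = y(\alpha^l)$. (Note that $A_{2r} = y(\alpha^{2r}) = y(\alpha^r)^2 = A_r^2$.) The decoder can
easily calculate the $A_l$ from y(x), as follows. Divide y(x) by the minimal
polynomial $M^{(l)}(x)$ of $\alpha^l$, say

$$ y(x) = Q(x) M^{(l)}(x) + R(x), \qquad \deg R(x) < \deg M^{(l)}(x). $$

Then $A_l = y(\alpha^l)$ is equal to R(x) evaluated at $x = \alpha^l$. The circuitry for this was
given in §4 of Ch. 3 and §8 of Ch. 7.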

For example, consider the [15, 7, 5] BCH code of Ch. 3, with $\delta = 5$.
Suppose $y(x) = y_0 + y_1 x + \cdots + y_{14} x^{14}$ is received. To calculate $A_1$, divide y(x)
by $M^{(1)}(x) = x^4 + x + 1$ (see §4 of Ch. 4) and set $x = \alpha$ in the remainder, giving
$A_1 = R(\alpha) = u_0 + u_1 \alpha + u_2 \alpha^2 + u_3 \alpha^3$. This is carried out by the top half of Fig.
9.3. To calculate $A_3$, divide y(x) by $M^{(3)}(x) = x^4 + x^3 + x^2 + x + 1$ and obtain the
remainder $R(x) = R_0 + R_1 x + R_2 x^2 + R_3 x^3$. Then

$$ A_3 = R(\alpha^3) = R_0 + R_1 \alpha^3 + R_2 \alpha^6 + R_3 \alpha^9 $$
$$ = R_0 + R_1 \alpha^3 + R_2(\alpha^2 + \alpha^3) + R_3(\alpha + \alpha^3) $$
$$ = R_0 + R_3 \alpha + R_2 \alpha^2 + (R_1 + R_2 + R_3)\alpha^3 $$
$$ = v_0 + v_1 \alpha + v_2 \alpha^2 + v_3 \alpha^3. $$

This calculation is carried out by the bottom half of Fig. 9.3.
Thus by the end of Stage (I), the decoder has found $A_1, A_3, \ldots, A_{\delta-2}$. Also
$A_2 = A_1^2$, $A_4 = A_2^2, \ldots, A_{\delta-1} = A_{(\delta-1)/2}^2$ are easily found if needed.
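The Stage (I) computation for the [15, 7, 5] code can be mimicked in software. The following sketch is an illustration, not the circuit of Fig. 9.3: it assumes GF(16) is represented by 4-bit integers with $\alpha$ = 0b0010 and $\alpha^4 = \alpha + 1$, and the helper names (`gf_mul`, `evaluate`, `remainder`, `syndrome`) are made up for this example. It obtains $A_1$ and $A_3$ from the remainders modulo $M^{(1)}(x)$ and $M^{(3)}(x)$, using the expression for $A_3$ derived above.

```python
# GF(16) as 4-bit integers, with alpha = 0b0010 and alpha^4 = alpha + 1.
EXP = [1]
for _ in range(14):
    v = EXP[-1] << 1
    EXP.append(v ^ 0b10011 if v & 0b10000 else v)
LOG = {v: k for k, v in enumerate(EXP)}


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def evaluate(bits, x):
    """Horner evaluation of y(x) = sum bits[i] x^i at a field element x."""
    acc = 0
    for c in reversed(bits):
        acc = gf_mul(acc, x) ^ c
    return acc


def remainder(bits, modulus):
    """Remainder of a binary polynomial division (lists, lowest degree first)."""
    r, dm = list(bits), len(modulus) - 1
    for d in range(len(r) - 1, dm - 1, -1):
        if r[d]:
            for k, c in enumerate(modulus):
                r[d - dm + k] ^= c
    return r[:dm]


M1 = [1, 1, 0, 0, 1]   # x^4 + x + 1
M3 = [1, 1, 1, 1, 1]   # x^4 + x^3 + x^2 + x + 1


def syndrome(y):
    """Return (A_1, A_3) for a received word y = [y_0, ..., y_14]."""
    u = remainder(y, M1)
    R = remainder(y, M3)
    A1 = evaluate(u, EXP[1])
    # A_3 = R_0 + R_3 a + R_2 a^2 + (R_1 + R_2 + R_3) a^3, packed into 4 bits.
    A3 = R[0] | (R[3] << 1) | (R[2] << 2) | ((R[1] ^ R[2] ^ R[3]) << 3)
    return A1, A3


y = [0] * 15
y[3] = y[10] = 1          # errors in coordinates 3 and 10 of the zero codeword
print(syndrome(y), (evaluate(y, EXP[1]), evaluate(y, EXP[3])))
```

The final line prints the syndrome obtained from the remainders next to the direct evaluations $y(\alpha)$, $y(\alpha^3)$; the two pairs should coincide.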


Stage (II). Finding the error locator polynomial $\sigma(z)$. Suppose e has weight w
and contains nonzero components $e_{i_1}, \ldots, e_{i_w}$. Then $i_1, \ldots, i_w$ are the
coordinates of y which are in error. As at the end of §1 define the locators
$X_r = \alpha^{i_r}$, r = 1, ..., w and the error locator polynomial

$$ \sigma(z) = \prod_{i=1}^{w} (1 - X_i z) = \sum_{i=0}^{w} \sigma_i z^i. \qquad (21) $$

Fig. 9.3. Stage I of decoder: Finding the syndrome.

Now we have

$$ A_l = y(\alpha^l) = c(\alpha^l) + e(\alpha^l) = e(\alpha^l), \quad \text{for } 1 \le l \le \delta - 1, $$

since $c(\alpha^l) = 0$ follows from the definition of a BCH code of designed
distance $\delta$, thus

$$ A_l = \sum_{i=1}^{w} X_i^l. $$


In Stage (I) the decoder has found the power sums $A_1, A_2, \ldots, A_{\delta-1}$. Stage
(II), which is much harder than the other stages, is to determine $\sigma(z)$ from $A_1$,
$A_2, \ldots, A_{\delta-1}$. The $\sigma_i$'s and $A_l$'s are related by Newton's identities (in two
forms, Equation (12) and Problems 52, 54 of Ch. 8). However, as we saw in
§3 of Ch. 1, the syndrome, i.e. the $A_l$'s, do not determine e or $\sigma(z)$ uniquely.
The decoder must find the vector e of lowest weight w, or the lowest degree
$\sigma(z)$, that will satisfy Newton's identities. (It is this uncertainty in w which
makes Stage (II) so difficult.)

Several techniques are known for finding $\sigma(z)$.

To begin with a simple example, consider the double-error-correcting BCH
code of §3 of Ch. 3. The decoding algorithm given there can be restated as
follows.

Stages. (I) Compute the syndrome $S = Hy^T = (A_1, A_3)^T$.

(II)

(i) If $A_1 = A_3 = 0$, set $\sigma(z) = 1$ (no errors occurred).

(ii) If $A_1 \ne 0$, $A_3 = A_1^3$, set $\sigma(z) = 1 + A_1 z$.

(iii) If $A_1 \ne 0$, $A_3 \ne A_1^3$, set

$$ \sigma(z) = 1 + A_1 z + \left(\frac{A_3}{A_1} + A_1^2\right) z^2. \qquad (22) $$

(iv) If $A_1 = 0$, $A_3 \ne 0$, detect that at least three errors occurred.

Once $\sigma(z)$ is found (in cases (i), (ii), (iii)) the decoder proceeds to Stage
(III) to find the reciprocals of the roots of $\sigma(z)$, the $X_i$'s, which say which
locations are in error.
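The four cases of Stage (II) for the double-error-correcting code of length 15 can be written as a short function. This is a sketch under stated assumptions: GF(16) is represented as 4-bit integers with $\alpha^4 = \alpha + 1$, the name `error_locator` is invented here, and the no-error case is represented by the constant polynomial 1, which is what (21) gives for w = 0.

```python
# GF(16) as 4-bit integers, with alpha = 0b0010 and alpha^4 = alpha + 1.
EXP = [1]
for _ in range(14):
    v = EXP[-1] << 1
    EXP.append(v ^ 0b10011 if v & 0b10000 else v)
LOG = {v: k for k, v in enumerate(EXP)}


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def gf_inv(a):
    return EXP[(-LOG[a]) % 15]


def error_locator(A1, A3):
    """Coefficients [1, sigma_1, ...] of sigma(z), or None in case (iv)."""
    if A1 == 0 and A3 == 0:
        return [1]                                  # case (i): no errors
    if A1 == 0:
        return None                                 # case (iv): >= 3 errors detected
    if A3 == gf_mul(A1, gf_mul(A1, A1)):
        return [1, A1]                              # case (ii): one error
    sigma2 = gf_mul(A3, gf_inv(A1)) ^ gf_mul(A1, A1)
    return [1, A1, sigma2]                          # case (iii): Equation (22)


# Two errors with locators X_1 = alpha^3 and X_2 = alpha^10.
X1, X2 = EXP[3], EXP[10]
A1 = X1 ^ X2
A3 = EXP[9] ^ EXP[30 % 15]
print(error_locator(A1, A3), [1, X1 ^ X2, gf_mul(X1, X2)])
```

For two errors the output should equal $[1, X_1 + X_2, X_1 X_2]$, the coefficients of $(1 + X_1 z)(1 + X_2 z)$.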

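For a general t-error-correcting BCH code, there are two approaches to
finding $\sigma(z)$, based on the two versions of Newton's identities.


Method (1). Using Newton's identities in the form of Equations (14), (16) of
Chapter 8. We begin with this method because of its simplicity, even though it
is not practical unless t is small. From Problem 54 of Ch. 8, if w errors
occurred, the $\sigma_i$'s and $A_l$'s are related by

$$ \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & \cdots & 0 \\ A_2 & A_1 & 1 & 0 & 0 & \cdots & 0 \\ A_4 & A_3 & A_2 & A_1 & 1 & \cdots & 0 \\ \vdots & & & & & & \vdots \\ A_{2w-4} & A_{2w-5} & \cdots & & & \cdots & A_{w-3} \\ A_{2w-2} & A_{2w-3} & \cdots & & & \cdots & A_{w-1} \end{pmatrix} \begin{pmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \\ \vdots \\ \sigma_{w-1} \\ \sigma_w \end{pmatrix} = \begin{pmatrix} A_1 \\ A_3 \\ A_5 \\ \vdots \\ A_{2w-3} \\ A_{2w-1} \end{pmatrix}. \qquad (23) $$

For example, for a double-error-correcting code this equation is (if w = 2)

$$ \begin{pmatrix} 1 & 0 \\ A_2 & A_1 \end{pmatrix} \begin{pmatrix} \sigma_1 \\ \sigma_2 \end{pmatrix} = \begin{pmatrix} A_1 \\ A_3 \end{pmatrix}. $$

The solution is

$$ \sigma_1 = A_1, \qquad \sigma_2 = \frac{A_3}{A_1} + A_1^2 $$

(using $A_2 = A_1^2$), and so the error locator polynomial is

$$ \sigma(z) = 1 + \sigma_1 z + \sigma_2 z^2 = 1 + A_1 z + \left(\frac{A_3}{A_1} + A_1^2\right) z^2, $$

in agreement with (22).
In general, Equations (23) can be solved iteratively using the following
theorem.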


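Theorem 14. (Peterson.) Let

$$ A_l = \sum_{i=1}^{w} X_i^l. $$

The $\nu \times \nu$ matrix

$$ M_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & \cdots & 0 \\ A_2 & A_1 & 1 & 0 & 0 & \cdots & 0 \\ A_4 & A_3 & A_2 & A_1 & 1 & \cdots & 0 \\ \vdots & & & & & & \vdots \\ A_{2\nu-4} & A_{2\nu-5} & \cdots & & & \cdots & A_{\nu-3} \\ A_{2\nu-2} & A_{2\nu-3} & \cdots & & & \cdots & A_{\nu-1} \end{pmatrix} $$

is nonsingular if $w = \nu$ or $w = \nu - 1$, and is singular if $w < \nu - 1$.


Proof. (i) Suppose $w < \nu - 1$. Then

$$ M_\nu\, (0, 1, \sigma_1, \ldots, \sigma_{\nu-2})^T = 0 $$

from (23), and so $M_\nu$ is singular.
(ii) Suppose $w = \nu$. Then

$$ \det M_\nu = \prod_{i<j} (X_i + X_j). $$

For if we put $X_i = X_j$, $\det M_\nu = 0$ by (i). Therefore $\det M_\nu = \text{const} \cdot \prod_{i<j} (X_i + X_j)$,
and the constant is easily found to be 1.
(iii) Finally, if $w = \nu - 1$, $M_\nu$ is nonsingular from (ii). Q.E.D.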


Using this theorem we have the following iterative algorithm for finding
$\sigma(z)$, for a BCH code of designed distance $\delta = 2t + 1$, assuming w errors
occur where $w \le t$.

Assume t errors occurred, and try to solve Equations (23) with w replaced
by t. By Theorem 14, if t or t - 1 errors have occurred, a solution exists and
we go to Stage (III). But if fewer than t - 1 errors occurred, the equations
have no solution. In this case assume t - 2 errors occurred, and again try to
solve (23) with w now replaced by t - 2. Repeat until a solution is found.

The difficulty with this method is that it requires repeated evaluation of a
large determinant over GF($2^m$). For this reason if t is large (e.g. bigger than 3
or 4) the next method is to be preferred.


Method (2). Using the generalized Newton's identities: the Berlekamp
algorithm. Assuming that w errors occurred, the $\sigma_i$'s and $A_l$'s are related by
Equations (12), (13) of Ch. 8. Equation (12) can be interpreted as saying that
the $A_l$'s are the output from a linear feedback shift register of w stages, with
initial contents $A_1, A_2, \ldots, A_w$ (see Fig. 9.4). The register is shown at the
instant when it contains $A_1, A_2, \ldots, A_w$ and

$$ A_{w+1} = -\sigma_1 A_w - \sigma_2 A_{w-1} - \cdots - \sigma_{w-1} A_2 - \sigma_w A_1 $$

is being formed.

Fig. 9.4. The $A_l$'s are produced by a shift register.

The decoder's problem is: given the sequence $A_1, A_2, \ldots, A_{\delta-1}$, find that
linear feedback shift register of shortest length w which produces $A_1, \ldots,$
$A_{\delta-1}$ as output when initially loaded with $A_1, \ldots, A_w$.
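The shift-register interpretation can be illustrated numerically. The sketch below does not implement Berlekamp's algorithm; it only checks, for one choice of error locators in GF(16), that a register with feedback taps $\sigma_1, \ldots, \sigma_w$ loaded with $A_1, \ldots, A_w$ regenerates the later power sums. The field representation ($\alpha^4 = \alpha + 1$) and all function names are assumptions of this example; in characteristic 2 the minus signs in the recurrence disappear.

```python
# GF(16) as 4-bit integers, with alpha = 0b0010 and alpha^4 = alpha + 1.
EXP = [1]
for _ in range(14):
    v = EXP[-1] << 1
    EXP.append(v ^ 0b10011 if v & 0b10000 else v)
LOG = {v: k for k, v in enumerate(EXP)}


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def gf_pow(a, e):
    """a^e for a nonzero field element a."""
    return EXP[(LOG[a] * e) % 15]


def locator_poly(locators):
    """Coefficients [1, sigma_1, ..., sigma_w] of prod (1 + X_i z)."""
    sigma = [1]
    for X in locators:
        nxt = sigma + [0]
        for k, c in enumerate(sigma):
            nxt[k + 1] ^= gf_mul(c, X)
        sigma = nxt
    return sigma


def power_sum(locators, l):
    total = 0
    for X in locators:
        total ^= gf_pow(X, l)
    return total


def lfsr_output(sigma, initial, length):
    """Extend A_1..A_w to `length` terms with A_l = sum_k sigma_k A_{l-k}."""
    w = len(sigma) - 1
    seq = list(initial)
    while len(seq) < length:
        nxt = 0
        for k in range(1, w + 1):
            nxt ^= gf_mul(sigma[k], seq[-k])
        seq.append(nxt)
    return seq


locators = [EXP[1], EXP[6], EXP[11]]          # three hypothetical error locators
sigma = locator_poly(locators)
A = [power_sum(locators, l) for l in range(1, 9)]
print(lfsr_output(sigma, A[:3], 8) == A)
```

The decoder faces the inverse problem: it knows the sequence of $A_l$'s and must find the shortest register, i.e. $\sigma(z)$.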

There is an efficient algorithm for finding such a shift register (and hence
the error locator polynomial $\sigma(z)$) which is due to Berlekamp. In Chapter 12
we shall describe a version of this algorithm which applies to a wide class of
codes, including Goppa codes and other generalizations of BCH codes.

Whichever method is used, at the end of Stage (II) the decoder knows the
error locator polynomial $\sigma(z)$.


Stage (III). Finding the roots of $\sigma(z)$. Since $\sigma(z) = \prod_{i=1}^{w} (1 - X_i z)$, the
reciprocals of the zeros of $\sigma(z)$ are $X_1 = \alpha^{i_1}, \ldots, X_w = \alpha^{i_w}$, and errors have occurred
at coordinates $i_1, \ldots, i_w$.
If $\sigma(z)$ has degree 1 or 2, the zeros can be found directly (see §7). But in
general the simplest technique is just to test each power of $\alpha$ in turn to see if
it is a zero of $\sigma(z)$. This part of the decoding is often called the Chien search.
There is an error in coordinate i iff $\sigma(\alpha^{-i}) = 0$.

Fig. 9.5. The complete decoder (channel, received vector $y_0 y_1 \cdots y_{n-1}$, Stages I, II and III).

Figure 9.5 shows all three stages of the decoder. To illustrate the circuitry
for Stage (III) we again consider the [15, 7, 5] code. See Fig. 9.6. The first digit
of y to reach point P is $y_{14}$, and $y_{14}$ is in error iff $\sigma(\alpha^{-14}) = \sigma(\alpha) =
1 + \sigma_1 \alpha + \sigma_2 \alpha^2 = 0$. The next digit to arrive is $y_{13}$, which is in error iff
$\sigma(\alpha^{-13}) = \sigma(\alpha^2) = 1 + \sigma_1 \alpha^2 + \sigma_2 \alpha^4 = 0$, and so on.

The circuit shown in Fig. 9.6 does exactly what is required. Initially the
three registers are loaded with 1, $\sigma_1$ and $\sigma_2$ (obtained from Stage II). The
second register is wired to multiply by $\alpha$, the third register by $\alpha^2$. After one
clock cycle the registers contain 1, $\sigma_1 \alpha$, $\sigma_2 \alpha^2$, and the output from the OR gate
at the bottom is 0 iff $1 + \sigma_1 \alpha + \sigma_2 \alpha^2 = \sigma(\alpha) = 0$. Otherwise the output is 1. The
NOT gate complements this output, and so a 1 arrives at Q iff $y_{14}$ is in error.
Then the adder at P corrects any error in $y_{14}$.

One clock cycle later, a 1 reaches Q iff $y_{13}$ is in error, and so on.

Fig. 9.6. Stage III. (One register is loaded initially with $\sigma_1$ and multiplies by $\alpha$;
another is loaded initially with $\sigma_2$ and multiplies by $\alpha^2$.)

Thus Stage III finds the zeros of $\sigma(z)$, and corrects the corresponding
errors in y.
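The three stages can be put together in a small software model of the decoder for the [15, 7, 5] code. This is a sketch, not the circuit of Figs. 9.5 and 9.6: it assumes GF(16) as 4-bit integers with $\alpha^4 = \alpha + 1$, uses invented function names, and covers only the case of at most two errors. The Chien search imitates the registers of Fig. 9.6: register k starts at $\sigma_k$ and is multiplied by $\alpha^k$ on each clock cycle, so after c cycles the sum of the registers is $\sigma(\alpha^c)$, which tests coordinate 15 - c.

```python
# GF(16) as 4-bit integers, with alpha = 0b0010 and alpha^4 = alpha + 1.
EXP = [1]
for _ in range(14):
    v = EXP[-1] << 1
    EXP.append(v ^ 0b10011 if v & 0b10000 else v)
LOG = {v: k for k, v in enumerate(EXP)}


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def evaluate(bits, x):
    """Horner evaluation of y(x) = sum bits[i] x^i at a field element x."""
    acc = 0
    for c in reversed(bits):
        acc = gf_mul(acc, x) ^ c
    return acc


def error_locator(A1, A3):
    """Stage II for at most two errors; None means >= 3 errors detected."""
    if A1 == 0:
        return [1] if A3 == 0 else None
    if A3 == gf_mul(A1, gf_mul(A1, A1)):
        return [1, A1]
    inv_A1 = EXP[(-LOG[A1]) % 15]
    return [1, A1, gf_mul(A3, inv_A1) ^ gf_mul(A1, A1)]


def chien_search(sigma):
    """Coordinates in error, examined in the order y_14, y_13, ..., y_0."""
    regs, positions = list(sigma), []
    for c in range(1, 16):
        for k in range(1, len(regs)):
            regs[k] = gf_mul(regs[k], EXP[k])   # register k multiplies by alpha^k
        total = 0
        for v in regs:
            total ^= v                          # total = sigma(alpha^c)
        if total == 0:
            positions.append(15 - c)
    return positions


def decode(y):
    """Correct up to two errors in y = [y_0, ..., y_14]; None if not possible."""
    sigma = error_locator(evaluate(y, EXP[1]), evaluate(y, EXP[3]))
    if sigma is None:
        return None
    corrected = list(y)
    for pos in chien_search(sigma):
        corrected[pos] ^= 1
    return corrected


# g(x) = (x^4 + x + 1)(x^4 + x^3 + x^2 + x + 1) = x^8 + x^7 + x^6 + x^4 + 1
codeword = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0]
received = list(codeword)
received[2] ^= 1
received[13] ^= 1
print(decode(received) == codeword)
```

Patterns of three or more errors are outside the scope of this sketch: they are either flagged (case (iv)) or lead to a $\sigma(z)$ without enough roots in the field.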


Decoding nonbinary BCH codes. Stage I is much the same as in the binary
case: the decoder finds $A_1, \ldots, A_{\delta-1}$. In Stage II, Equation (13) of Ch. 8 must
be used instead of Newton's identities (16). After finding $\sigma(z)$, Equation (19)
of Ch. 8 may be used to find the error evaluator polynomial $\omega(z)$.
Alternatively, Berlekamp's algorithm gives an efficient way of finding $\sigma(z)$ and
$\omega(z)$ simultaneously. In Stage (III), when a zero of $\sigma(z)$ is found, indicating
the presence of an error, Equation (18) of Ch. 8 is used to find the value of the
error.


Correcting more than t errors. The decoding algorithms we have described 
only correct t or fewer errors in a BCH code of designed distance 2t + 1. 
Complete decoding algorithms (in the sense of §3 of Ch. 1) are known for all
double- and some triple-error-correcting codes (see Notes), but the following 
problem remains unsolved. 


Research Problem (9.3). Find a complete decoding algorithm for all BCH 
codes. 


Problem. (5) What is the expression for $\sigma(z)$ analogous to Equation (22), if
three errors occur in a triple-error-correcting BCH code?


§7. Quadratic equations over GF($2^m$)


The last step in decoding a BCH code is to find the roots of the error
locator polynomial $\sigma(z)$. Over GF($2^m$) quadratic equations can be solved
almost as easily as linear equations, as we see in this section. The results
obtained here will also be used in §5 of Ch. 12 when studying
double-error-correcting Goppa codes.

In §9 of Ch. 4 it was shown that GF($2^m$) has a basis of the form $\gamma, \gamma^2,
\gamma^4, \ldots, \gamma^{2^{m-1}}$, called a normal basis. A typical element $\beta$ of GF($2^m$) can be
written

$$ \beta = b_0 \gamma + b_1 \gamma^2 + b_2 \gamma^4 + \cdots + b_{m-1} \gamma^{2^{m-1}}, \qquad (24) $$

$b_i$ = 0 or 1, and the trace of $\beta$ is (since $T_m(\gamma^{2^i}) = T_m(\gamma) = 1$)

$$ T_m(\beta) = b_0 + b_1 + \cdots + b_{m-1}. \qquad (25) $$


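Theorem 15. The quadratic equation $x^2 + x + \beta = 0$, $\beta \in$ GF($2^m$), has two roots
in GF($2^m$) if $T_m(\beta) = 0$, and has no roots in GF($2^m$) if $T_m(\beta) = 1$. Thus

$$ x^2 + x + \beta = (x + \eta)(x + \xi), \quad \text{for } \eta, \xi \in GF(2^m), $$

if $T_m(\beta) = 0$, but $x^2 + x + \beta$ is irreducible over GF($2^m$) if $T_m(\beta) = 1$.


Proof. Express x in terms of the normal basis as

$$ x = x_0 \gamma + x_1 \gamma^2 + \cdots + x_{m-1} \gamma^{2^{m-1}}. $$

Then

$$ x^2 = x_{m-1} \gamma + x_0 \gamma^2 + \cdots + x_{m-2} \gamma^{2^{m-1}}. $$

If x is a solution to $x^2 + x + \beta = 0$ then (equating coefficients of $\gamma^{2^i}$),

$$ x_0 + x_{m-1} = b_0, \quad x_1 + x_0 = b_1, \quad \ldots, \quad x_{m-1} + x_{m-2} = b_{m-1}. $$

Adding these equations gives

$$ 0 = \sum_{i=0}^{m-1} b_i = T_m(\beta) \quad \text{by (25)}. $$

Thus $T_m(\beta) = 0$ is a necessary condition for $x^2 + x + \beta = 0$ to have a solution.
It is also sufficient, for if $T_m(\beta) = 0$, then there are two solutions given by
$x_0 = \delta$, $x_1 = \delta + b_1$, $x_2 = \delta + b_1 + b_2, \ldots, x_{m-1} = \delta + b_1 + b_2 + \cdots + b_{m-1}$, where $\delta = 0$
or 1. Q.E.D.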


Examples. In GF(2), $T_1(1) = 1$ and $x^2 + x + 1$ is irreducible. In GF(4), $T_2(\alpha) =
\alpha + \alpha^2 = 1$, and $x^2 + x + \alpha$ is irreducible.
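Theorem 15 is easily confirmed by exhaustive search in a small field. The sketch below is an illustration only; it assumes GF(16) is represented as 4-bit integers with $\alpha^4 = \alpha + 1$ (a polynomial basis rather than the normal basis used in the proof), and the names `trace` and `roots_of_quadratic` are made up here.

```python
# GF(16) as 4-bit integers, with alpha = 0b0010 and alpha^4 = alpha + 1.
EXP = [1]
for _ in range(14):
    v = EXP[-1] << 1
    EXP.append(v ^ 0b10011 if v & 0b10000 else v)
LOG = {v: k for k, v in enumerate(EXP)}


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def trace(beta):
    """T_4(beta) = beta + beta^2 + beta^4 + beta^8, which is 0 or 1."""
    total, power = 0, beta
    for _ in range(4):
        total ^= power
        power = gf_mul(power, power)
    return total


def roots_of_quadratic(beta):
    """All x in GF(16) with x^2 + x + beta = 0, found by exhaustive search."""
    return [x for x in range(16) if gf_mul(x, x) ^ x ^ beta == 0]


for beta in range(16):
    print(beta, trace(beta), roots_of_quadratic(beta))
```

Each line should show either trace 0 with two roots, which differ by 1, or trace 1 with no roots.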


Theorem 16. Let $\beta$ be a fixed element of GF($2^m$) with trace 1. Any irreducible
quadratic over GF($2^m$) can be transformed into $\xi(x^2 + x + \beta)$, for some $\xi \in$
GF($2^m$), by an appropriate change of variable.


Proof. Suppose $ax^2 + bx + c$ is irreducible, with $a \ne 0$, $c \ne 0$. Furthermore $b \ne 0$
or this is a perfect square. Replacing x by bx/a changes this to

$$ \frac{b^2}{a}\,(x^2 + x + d), \qquad d = \frac{ac}{b^2}. $$

Now $T_m(d) = 1$ by Theorem 15. Therefore $T_m(\beta + d) = T_m(\beta) + T_m(d) = 0$.
Again by Theorem 15 there exists an $\varepsilon$ in GF($2^m$) such that $\varepsilon^2 + \varepsilon + (\beta + d) =
0$. Replacing x by $x + \varepsilon$ changes the quadratic to $\xi(x^2 + x + \beta)$. Q.E.D.







Problem. (6) Let $\beta$ be a fixed element of GF($2^m$) with trace 0. Show that any
reducible quadratic over GF($2^m$) with distinct roots can be transformed into
$\xi(x^2 + x + \beta)$, for some $\xi \in$ GF($2^m$), by an appropriate change of variable.


§8. Double-error-correcting BCH codes are quasi-perfect


Theorem 17. (Gorenstein, Peterson, and Zierler.) Let $\mathcal{C}$ be a double-error-
correcting binary BCH code of length $n = 2^m - 1$. Then $\mathcal{C}$ is quasi-perfect (see
§5 of Ch. 1), i.e. any coset of $\mathcal{C}$ contains a vector of weight $\le 3$.


Proof. If m is odd this result was established in §6 of Ch. 6.
Suppose m is even. ($\mathcal{C}^{\perp}$ now has five weights, see Fig. 15.4, so the
argument used in Ch. 6 fails.) Let the parity check matrix of $\mathcal{C}$ be

$$ H = \begin{pmatrix} 1 & \alpha & \alpha^2 & \cdots & \alpha^{n-1} \\ 1 & \alpha^3 & \alpha^6 & \cdots & \alpha^{3(n-1)} \end{pmatrix}. $$

If u is any vector of length n, with locators $X_1, \ldots, X_t$, t = wt(u), the
syndrome of u is

$$ Hu^T = \begin{pmatrix} A_1 \\ A_3 \end{pmatrix}, \qquad A_1 = \sum_{i=1}^{t} X_i, \quad A_3 = \sum_{i=1}^{t} X_i^3. $$

By Theorem 5 of Ch. 1 there is a one-one correspondence between cosets and
syndromes. So we must show that for any syndrome $A_1, A_3$ there is a
corresponding vector u of weight $\le 3$. In other words, given any $A_1, A_3 \in$
GF($2^m$), we must find $X_1, X_2, X_3$ in GF($2^m$) such that

$$ X_1 + X_2 + X_3 = A_1, $$
$$ X_1^3 + X_2^3 + X_3^3 = A_3. \qquad (26) $$

Put $y_i = X_i + A_1$, i = 1, 2, 3. Then $X_1, X_2, X_3$ exist satisfying (26) iff $y_1, y_2, y_3$
exist satisfying

$$ y_1 + y_2 + y_3 = 0, $$
$$ y_1^3 + y_2^3 + y_3^3 = s, $$

where $s = A_1^3 + A_3$. Substituting $y_3 = y_1 + y_2$ into the second equation gives
$y_1 y_2 (y_1 + y_2) = s$.

We will find a solution with $y_2 \ne 0$. Setting $y = y_1/y_2$ gives

$$ y^2 + y + \frac{s}{y_2^3} = 0. \qquad (27) $$

By Theorem 15 we must find a $y_2$ such that $T_m(s/y_2^3) = 0$. If s = 0, (27) has the
solutions y = 0 and 1. Suppose $s = \alpha^{3i+\nu}$, where $\nu$ = 0, 1, or 2. If $\nu = 0$, take
$y_2 = \alpha^i$ and $T_m(s/y_2^3) = T_m(1) = 0$ as required. There are n/3 elements with
$\nu = 0$. Since n/3 < n/2, there is a $\theta \in$ GF($2^m$) with $T_m(\theta) = 0$ which is not a
cube, say $\theta = \alpha^{3j+1}$ (the case $\theta = \alpha^{3j+2}$ is similar). Then $\theta^2 = \alpha^{6j+2}$ also has trace
0. Now if for example $s = \alpha^{3i+1}$, the choice $y_2 = \alpha^{i-j}$ gives $T_m(s/y_2^3) = T_m(\theta) =
0$ as required. Q.E.D.
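For m = 4 the statement of Theorem 17 can be verified by brute force: the [15, 7, 5] code has $2^8 = 256$ cosets, and one can list the syndromes $(A_1, A_3)$ of all vectors of weight at most 3. The sketch below assumes GF(16) as 4-bit integers with $\alpha^4 = \alpha + 1$; the function name `syndromes_of_weight_at_most` is hypothetical.

```python
from itertools import combinations

# GF(16) as 4-bit integers, with alpha = 0b0010 and alpha^4 = alpha + 1.
EXP = [1]
for _ in range(14):
    v = EXP[-1] << 1
    EXP.append(v ^ 0b10011 if v & 0b10000 else v)
LOG = {v: k for k, v in enumerate(EXP)}


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def syndromes_of_weight_at_most(w):
    """Set of syndromes (A_1, A_3) of all vectors of length 15 and weight <= w."""
    found = set()
    for size in range(w + 1):
        for locators in combinations(EXP, size):   # EXP lists the 15 locators
            A1 = A3 = 0
            for X in locators:
                A1 ^= X
                A3 ^= gf_mul(X, gf_mul(X, X))
            found.add((A1, A3))
    return found


print(len(syndromes_of_weight_at_most(2)), len(syndromes_of_weight_at_most(3)))
```

Vectors of weight at most 2 reach only 1 + 15 + 105 = 121 of the 256 syndromes, while weight at most 3 reaches all of them, so every coset has weight at most 3 in this case.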


Problem. (7) (Gorenstein, Peterson and Zierler.) (i) Let $\mathcal{C}_1$ be a BCH code
over GF(q) of Bose distance $d_1 = 2t_1 + 1$ (see §6 of Ch. 7). Suppose the BCH
code $\mathcal{C}_2$ of designed distance $2t_1 + 2$ has Bose distance $d_2$. Show that $\mathcal{C}_2$ has a
coset of weight $\ge d_1$.

(ii) Hence show that a primitive triple-error-correcting binary BCH code of
length at least 15 has a coset of weight $\ge 5$, and so is not quasi-perfect.


Research Problem (9.4). Show that no other BCH codes are quasi-perfect. 


§9. The Carlitz-Uchiyama bound


Theorem 18. Suppose $\mathcal{C}$ is a binary BCH code of length $n = 2^m - 1$ with
designed distance $\delta = 2t + 1$, where

$$ 2t - 1 < 2^{\lceil m/2 \rceil} + 1. \qquad (28) $$

Then the weight w of any nonzero codeword in $\mathcal{C}^{\perp}$ lies in the range

$$ 2^{m-1} - (t-1)2^{m/2} \le w \le 2^{m-1} + (t-1)2^{m/2}. \qquad (29) $$

Note that w must be even.


Proof. The idea of the proof is this. The number of 0's in a codeword
$a = (a_0, \ldots, a_{n-1}) \in \mathcal{C}^{\perp}$, minus the number of 1's, is equal to $\sum_{i=0}^{n-1} (-1)^{a_i}$. Using
the Mattson-Solomon polynomial this is written as an exponential sum (30)
involving the trace function. Then a deep theorem of Carlitz and Uchiyama is
invoked to show this sum is small. Therefore the number of 0's is
approximately equal to the number of 1's, and the weight of a is roughly n/2.

The zeros of $\mathcal{C}$ are in the cyclotomic cosets $C_1, C_3, \ldots, C_{2t-1}$. From (28)
and Corollary 8 these cosets are all distinct and have size m. By §5 of Ch. 7,
the nonzeros of $\mathcal{C}^{\perp}$ are $C_{-1}, C_{-3}, \ldots, C_{-2t+1}$. By §6 of Ch. 8 the MS polynomial
of $a \in \mathcal{C}^{\perp}$ is

$$ A(z) = \sum_{\substack{i=1 \\ i \text{ odd}}}^{2t-1} T_m(\beta_i z^i), \quad \beta_i \in GF(2^m), $$
$$ = T_m\Bigl(\sum_{\substack{i=1 \\ i \text{ odd}}}^{2t-1} \beta_i z^i\Bigr) = T_m(f(z)), $$

where f(z) has degree 2t - 1 and f(0) = 0. Now

$$ \sum_{i=0}^{n-1} (-1)^{a_i} = \text{number of 0's in } a - \text{number of 1's in } a $$
$$ = n - 2\,\mathrm{wt}(a) $$
$$ = \sum_{i=0}^{n-1} (-1)^{A(\alpha^i)}, \quad \text{by Theorem 20 of Ch. 8,} $$
$$ = \sum_{i=0}^{n-1} (-1)^{T_m(f(\alpha^i))}. \qquad (30) $$

We quote without proof the following: 


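Theorem 19. (Carlitz and Uchiyama.) If f(z) is a polynomial over GF($2^m$) of
degree r such that $f(z) \ne g(z)^2 + g(z) + b$ for all polynomials g(z) over GF($2^m$)
and constants $b \in$ GF($2^m$), then

$$ \Bigl| \sum_{\beta \in GF(2^m)} (-1)^{T_m(f(\beta))} \Bigr| \le (r - 1)\, 2^{m/2}. $$

Certainly our f(z) satisfies the hypothesis since it has odd degree. Therefore

$$ \Bigl| \sum_{\beta \in GF(2^m)} (-1)^{T_m(f(\beta))} \Bigr| = \Bigl| 1 + \sum_{i=0}^{n-1} (-1)^{T_m(f(\alpha^i))} \Bigr| \le (2t - 2)\, 2^{m/2}, $$

$$ |1 + n - 2\,\mathrm{wt}(a)| \le (2t - 2)\, 2^{m/2}, $$

which implies (29). Q.E.D.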


Corollary 20. Let $\mathcal{C}$ be any binary BCH code of length $n = 2^m - 1$ and designed
distance $\delta = 2t + 1$. Then the minimum distance of $\mathcal{C}^{\perp}$ is at least

$$ 2^{m-1} - (t-1)\, 2^{m/2}. \qquad (31) $$

Proof. If $2t - 1 < 2^{\lceil m/2 \rceil} + 1$ the result follows from Theorem 18. But if
$2t - 1 \ge 2^{\lceil m/2 \rceil} + 1$, the expression (31) is at most zero. Q.E.D.


Examples. When t = 1, $\mathcal{C}^{\perp}$ is a simplex code with all nonzero weights equal to
$2^{m-1}$, and in fact (29) states $w = 2^{m-1}$. When t = 2, (29) gives $2^{m-1} - 2^{m/2} \le w \le
2^{m-1} + 2^{m/2}$, which is consistent with Figs. 15.3 and 15.4.

As a final example let $\mathcal{C}$ be the BCH code of length 127 and designed
distance 11. From Fig. 4.4 of Ch. 4, $\mathcal{C}^{\perp}$ has 15 consecutive zeros $\alpha^{113}, \ldots, \alpha^{126}$,
1, and so by the BCH bound (Theorem 8 of Ch. 7), $d \ge 16$. However Theorem
18 gives $d \ge 19$, and since d must be even, $d \ge 20$.

In fact the data suggest that a slightly stronger result should hold.

Research Problem (9.5). Can the conclusion of Theorem 18 be strengthened to

$$ 2^{m-1} - (t-1)\, 2^{\lfloor m/2 \rfloor} \le w \le 2^{m-1} + (t-1)\, 2^{\lfloor m/2 \rfloor}\,? $$


A simpler bound on the minimum distance of the dual of a BCH code is 
given by the following 


Theorem 21. (Sidel'nikov.) Let $\mathcal{C}$ be a binary BCH code of length $2^m - 1$ and
designed distance 2t + 1. Then $\mathcal{C}^{\perp}$ has minimum distance

$$ d' \ge 2^{\,m-1-\lfloor \log_2 (2t-1) \rfloor}. $$

Proof. The MS polynomial of $a \in \mathcal{C}^{\perp}$ has the form

$$ A(z) = \sum_{i} \sum_{j \in C_i} \beta_j z^j, \qquad \beta_j \in GF(2^m), $$

where i runs through the distinct coset representatives among $C_1, C_3, \ldots,
C_{2t-1}$. The binary m-tuple corresponding to any such i has the form

$$ (0, \ldots, 0, i_r, \ldots, i_1, i_0), \qquad i_\nu = 0 \text{ or } 1, $$

where $r \le \lfloor \log_2 (2t-1) \rfloor$. This contains a run of at least $m - 1 -
\lfloor \log_2 (2t-1) \rfloor$ zeros. The binary m-tuple corresponding to $j \in C_i$ is some cyclic
shift of the m-tuple corresponding to i, hence

$$ j \le 2^m - 2^{\,m-1-\lfloor \log_2 (2t-1) \rfloor}. $$

By Theorem 26 of Ch. 8, since 0 is a zero of A(z),

$$ \mathrm{wt}(a) \ge 2^m - 1 - (\deg A(z) - 1) \ge 2^{\,m-1-\lfloor \log_2 (2t-1) \rfloor}. \qquad \text{Q.E.D.} $$
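The two lower bounds on the minimum distance of the dual are simple enough to tabulate. The following sketch (function names invented for this example) evaluates the Carlitz-Uchiyama expression (31), rounded up to the next even integer as in the length-127 example above, and the Sidel'nikov bound of Theorem 21. The rounding to an even value relies on the remark after Theorem 18 and is therefore only meaningful when condition (28) holds.

```python
import math


def carlitz_uchiyama_bound(m, t):
    """Smallest even integer >= 2^(m-1) - (t-1) 2^(m/2), and at least 0."""
    raw = 2 ** (m - 1) - (t - 1) * 2 ** (m / 2)
    w = max(0, math.ceil(raw))
    return w + (w % 2)


def sidelnikov_bound(m, t):
    """2^(m - 1 - floor(log2(2t - 1))), computed with exact integer arithmetic."""
    floor_log = (2 * t - 1).bit_length() - 1
    return 2 ** (m - 1 - floor_log)


# The example from the text: length 127 (m = 7), designed distance 11 (t = 5).
print(carlitz_uchiyama_bound(7, 5), sidelnikov_bound(7, 5))
```

For m = 7, t = 5 this gives 20 from (31) and 8 from Theorem 21, so the simpler bound is noticeably weaker in this case.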


*§10. Some weight distributions are asymptotically normal 


Let $\mathcal{C}$ be an [n, k, d] binary code, with weight distribution $(A_0, A_1, \ldots, A_n)$,
where $A_i$ is the number of codewords of weight i. Then

$$ a = (a_0, a_1, \ldots, a_n), \quad \text{where } a_i = A_i / 2^k, \qquad (32) $$

is a vector with $a_0 = 2^{-k}$, $a_1 = \cdots = a_{d-1} = 0$, $a_i \ge 0$, and $\sum a_i = 1$.


Example. Let $\mathcal{C}$ be the [31, 21, 5] double-error-correcting BCH code. From
Fig. 15.3, $\mathcal{C}^{\perp}$ has weight distribution

    i:     0    12    16    20
    A_i':  1    310   527   186

By Theorem 1 of Ch. 5, the weight enumerator of $\mathcal{C}$ is

$$ \frac{1}{2^{10}} \bigl[ (x+y)^{31} + 310 (x+y)^{19} (x-y)^{12} + 527 (x+y)^{15} (x-y)^{16} + 186 (x+y)^{11} (x-y)^{20} \bigr]. $$

Therefore the weight distribution of $\mathcal{C}$ is

    i:       0     5   6    7    8    9    10    11    12
    A_i/31:  1/31  6   26   85   255  610  1342  2760  4600

    i:       13    14    15    16    17
    A_i/31:  6300  8100  9741  9741  8100

The numbers $a_0, \ldots, a_{31}$ are plotted in Fig. 9.7. The reader will immediately be
struck by how smooth and regular this figure is. The same phenomenon can
be observed in many codes. A rule-of-thumb which works for many codes is
that $a_j$ is well approximated by the binomial distribution, i.e. by

$$ a_j \approx b_j = 2^{-n} \binom{n}{j} \quad \text{for } j \ge d. \qquad (33) $$

Fig. 9.7. The numbers $a_i = A_i/2^k$ for the [31, 21, 5] BCH code (plotted against i).

Note that equality holds in (33) if $\mathcal{C}$ is the [n, n, 1] code $F^n$. Also it is not
difficult to show that (33) holds for an [n, k] code chosen at random (see
Problem 8).

In this section we give, in Theorem 23, some justification for (33) in the
case that the dual code $\mathcal{C}^{\perp}$ has large minimum distance. Some knowledge of
probability theory is assumed.

So far we haven't said what we mean by the symbol $\approx$ in (33). To get a
precise statement we shall define a cumulative distribution function A(z)
associated with the code.
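Let $a = (a_0, \ldots, a_n)$ be any real vector with $a_j \ge 0$ and $\sum a_j = 1$. The mean and variance of $a$ are defined by

\[ \mu = \mu(a) = \sum_{j=0}^{n} j a_j, \qquad \sigma^2 = \sigma^2(a) = \sum_{j=0}^{n} (\mu - j)^2 a_j. \]

The $r$th central moment is

\[ \mu_r(a) = \sum_{j=0}^{n} \left( \frac{\mu - j}{\sigma} \right)^r a_j, \qquad r = 0, 1, \ldots. \]

Thus $\mu_0(a) = 1$, $\mu_1(a) = 0$, $\mu_2(a) = 1$.

Of course the binomial vector $b = (b_0, \ldots, b_n)$, where $b_j = 2^{-n} \binom{n}{j}$, plays a special role. For this vector $\mu(b) = n/2$, $\sigma^2(b) = n/4$, and

\[ \mu_r(b) = \frac{1}{2^n} \sum_{j=0}^{n} \left( \frac{n - 2j}{\sqrt{n}} \right)^r \binom{n}{j}. \qquad (34) \]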

Lemma 22. Let $\mathcal{C}$ be a code and $a$ be as in Equation (32). If $\mathcal{C}^\perp$ has minimum distance $d'$, then

\[ \mu_r(a) = \mu_r(b) \quad \text{for } r = 0, 1, \ldots, d' - 1. \]

Proof. Follows from Problem 9e of Ch. 5. Q.E.D.
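Lemma 22 can be illustrated on the [31, 21, 5] BCH code, whose dual has minimum distance $d' = 12$. The sketch below is only an illustration under stated assumptions: the weight distribution is rebuilt from the dual distribution 1, 310, 527, 186 (weights 0, 12, 16, 20) by the MacWilliams identity, and the moments are compared without the common factor $\sigma^{-r}$ so that exact rational arithmetic can be used. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def krawtchouk(n, j, i):
    """Coefficient of x^(n-j) y^j in (x+y)^(n-i) (x-y)^i."""
    return sum((-1) ** s * comb(i, s) * comb(n - i, j - s) for s in range(j + 1))

def scaled_central_moment(vec, r):
    """sigma^r times the r-th central moment: sum_j (mu - j)^r vec[j]."""
    mu = sum(j * p for j, p in enumerate(vec))
    return sum((mu - j) ** r * p for j, p in enumerate(vec))

n, k = 31, 21
dual = {0: 1, 12: 310, 16: 527, 20: 186}
A = [sum(c * krawtchouk(n, j, i) for i, c in dual.items()) // 2 ** (n - k)
     for j in range(n + 1)]
a = [Fraction(A_j, 2 ** k) for A_j in A]                   # vector a of (32)
b = [Fraction(comb(n, j), 2 ** n) for j in range(n + 1)]   # binomial vector

d_dual = min(i for i in dual if i > 0)                     # d' = 12
agree = [scaled_central_moment(a, r) == scaled_central_moment(b, r)
         for r in range(d_dual + 1)]
print(agree)
```

Both vectors have mean $n/2$ and variance $n/4$ here, so comparing the unnormalized sums is equivalent to comparing $\mu_r(a)$ with $\mu_r(b)$. The list shows agreement for $r < d'$ and a mismatch at $r = d'$.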


Definition. The cumulative distribution function (c.d.f.) $A(z)$ associated with $a$ is given by

\[ A(z) = \sum_{j \ge \mu - \sigma z} a_j, \]

for any real number $z$.
The classical central limit theorem then states that the cumulative distribution function $B(z)$ of $b$ approaches the normal or Gaussian c.d.f. defined by

\[ \Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2} \, dt, \]

as $n \to \infty$. This is a special case of the main theorem of this section:


Theorem 23 (Sidel'nikov.) Let $\mathcal{C}$ be an $[n, k, d]$ binary code, and let $d' \ge 3$ be the minimum distance of the dual code $\mathcal{C}^\perp$. Then

\[ |A(z) - \Phi(z)| \le \frac{20}{\sqrt{d'}}. \]


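Proof. (Some details are omitted.) Let $a$ be the vector defined by (32). By (20) of Ch. 5, $\sigma^2(a) = n/4$. Define the characteristic function $\alpha(t)$ of $a$ by

\[ \alpha(t) = \sum_{r=0}^{\infty} \mu_r(a) \frac{(it)^r}{r!} \qquad (35) \]
\[ = \sum_{j=0}^{n} a_j \sum_{r=0}^{\infty} \frac{1}{r!} \left[ \frac{i(\mu - j)t}{\sigma} \right]^r = \sum_{j=0}^{n} a_j e^{i(\mu - j)t/\sigma}. \qquad (36) \]

Equation (35) is the Taylor series of $\alpha(t)$, and we shall need a bound on the tail of this series. One first shows (e.g. by induction on $r$) that for $r = 1, 2, \ldots$ and $t > 0$,

\[ \left| e^{it} - 1 - \frac{it}{1!} - \cdots - \frac{(it)^{r-1}}{(r-1)!} \right| \le \frac{t^r}{r!}. \]

Using (36) this implies that, for $r$ even and any $t$,

\[ \left| \alpha(\xi + t) - \alpha(\xi) - \frac{t}{1!} \alpha'(\xi) - \cdots - \frac{t^{r-1}}{(r-1)!} \alpha^{(r-1)}(\xi) \right| \le \mu_r(a) \frac{|t|^r}{r!}, \]

and so, putting $\xi = 0$,

\[ \left| \alpha(t) - \sum_{s=0}^{r-1} \mu_s(a) \frac{(it)^s}{s!} \right| \le \mu_r(a) \frac{|t|^r}{r!}. \qquad (37) \]

Now set $r = 2[(d' - 1)/2]$, the largest even integer less than $d'$. Then from Lemma 22, (37) can be replaced by

\[ \left| \alpha(t) - \sum_{s=0}^{r-1} \mu_s(b) \frac{(it)^s}{s!} \right| \le \mu_r(b) \frac{|t|^r}{r!}. \qquad (38) \]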

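As starting point for the proof we use a classical formula from probability theory (see for example Feller [427, Equation (3.13) p. 512 and Equation (5.3) p. 516]):

\[ |A(z) - \Phi(z)| \le \frac{1}{\pi} \int_{-T}^{T} \left| \frac{\alpha(t) - e^{-t^2/2}}{t} \right| dt + \frac{24}{T \pi \sqrt{2\pi}} \qquad (39) \]

for all $T > 0$. Using (38) the integral term in (39) is

\[ \le \frac{1}{\pi} \int_{-T}^{T} \frac{1}{|t|} \left| \cos^n (t n^{-1/2}) - e^{-t^2/2} \right| dt + \frac{2 \mu_r(b)}{\pi \, r!} \int_{-T}^{T} |t|^{r-1} \, dt. \qquad (40) \]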




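The first of these two integrals is equal to

\[ \frac{1}{\pi} \int_{-T}^{T} \frac{1}{|t|} \left| \cos^n (t n^{-1/2}) - e^{-t^2/2} \right| dt = \frac{2}{\pi} \int_{0}^{T} \frac{e^{-t^2/2}}{t} \left| \exp \left( \frac{t^2}{2} + n \log \cos (t n^{-1/2}) \right) - 1 \right| dt. \]

Let $\tau = t n^{-1/2}$. We estimate $\tau^2/2 + \log \cos \tau$ by expanding $\log (1 - (1 - \cos \tau))$ in powers of $1 - \cos \tau$. This gives

\[ \left| \frac{\tau^2}{2} + \log \cos \tau \right| \le \left| \frac{\tau^2}{2} - (1 - \cos \tau) \right| + \sum_{k=2}^{\infty} \frac{(1 - \cos \tau)^k}{k} \le \frac{\tau^4}{24} + \frac{\tau^4}{4} = \frac{7 \tau^4}{24}, \quad \text{for } |\tau| < 1, \]

using

\[ \frac{\tau^2}{2!} - \frac{\tau^4}{4!} \le 1 - \cos \tau \le \frac{\tau^2}{2!}. \]

Thus the integrand is

\[ \le \frac{e^{-t^2/2}}{t} \left( e^{7t^4/24n} - 1 \right) \le \frac{7 t^3}{24 n} e^{-t^2/2 + 7t^4/24n}, \]

using $e^x - 1 \le x e^x$ for $x \ge 0$,

\[ \le \frac{7 t^3}{24 n} e^{-5t^2/24}, \quad \text{provided } T \le n^{1/2}. \]

Therefore the first integral in (40) is

\[ \le \frac{7}{12 \pi n} \int_{0}^{\infty} t^3 e^{-5t^2/24} \, dt = \frac{168}{25 \pi n}. \]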
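The second integral in (40) is

\[ \frac{4 \mu_r(b) T^r}{\pi \, r \cdot r!}. \]

By Problem 9 and Stirling's formula, this is

\[ \le \frac{4 T^r}{\pi r \, 2^{r/2} (r/2)!} \le \frac{4 T^r e^{r/2}}{\pi r \sqrt{\pi r} \, r^{r/2}}. \]

Collecting these results gives, for all $T \le n^{1/2}$,

\[ |A(z) - \Phi(z)| \le \frac{168}{25 \pi n} + \frac{4 T^r e^{r/2}}{\pi r \sqrt{\pi r} \, r^{r/2}} + \frac{24}{T \pi \sqrt{2\pi}}. \]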


The choice of 
3 6 Ur 
r-(2) (em) 


now leads to the desired result. Q.E.D. 


From the Carlitz-Uchiyama bound (Theorem 18), and Theorem 21, this 
theorem applies to BCH codes of small designed distance. Other applications 
will be given in the Notes and in the quadratic residue code chapter. 


Corollary 24. A version of the classical central limit theorem. The cumulative distribution function $B(z)$ of the binomial distribution $b$ satisfies

\[ |B(z) - \Phi(z)| \le \frac{20}{\sqrt{n}}. \]

Proof. Let $\mathcal{C}$ be the $[n, n, 1]$ code $F^n$. Then $d' = n$. Q.E.D.
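To see how loose the constant in Corollary 24 is, the actual gap between $B(z)$ and $\Phi(z)$ can be computed for small $n$. The following is a rough numerical sketch, not part of the proof; `phi` and `sup_gap` are illustrative names, and floating point arithmetic is assumed to be accurate enough for these sizes.

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def sup_gap(n):
    """Supremum over z of |B(z) - Phi(z)| for the binomial vector b."""
    mu, sigma = n / 2, sqrt(n) / 2
    b = [comb(n, j) / 2 ** n for j in range(n + 1)]
    gap, tail = 0.0, 0.0             # tail = sum of b_i over i > j
    for j in range(n, -1, -1):
        z = (mu - j) / sigma         # B(z) jumps at this point
        gap = max(gap, abs(tail - phi(z)))   # limit of B from the left
        tail += b[j]
        gap = max(gap, abs(tail - phi(z)))   # value of B at z
    return gap

for n in (25, 100, 400):
    print(n, round(sup_gap(n), 4), 20 / sqrt(n))
```

Since $B$ is a step function and $\Phi$ is increasing, the supremum is attained at a jump of $B$, which is why only the jump points are examined.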


Of course the constant, 20, in Theorem 23 is very poor, for $d'$ must exceed 400 before Theorem 23 says anything. On the other hand Fig. 9.7 suggests that $\Phi(z)$ is a good approximation to $A(z)$ even for short codes.


Research Problem (9.6). Strengthen Theorem 23. 


Problems. (8) (See also Problem 31 of Ch. 17.) Consider an $[n, k]$ binary code $\mathcal{C}$ with generator matrix $G = [I \mid A]$, where $A$ is a $k \times (n - k)$ matrix of 0's and 1's (cf. Equation (8) of Ch. 1). Suppose each entry of $A$ is chosen at random, to be 0 or 1 with probability $\frac{1}{2}$, and then choose one of the codewords $u$ of $\mathcal{C}$ at random. Show that

\[ \mathrm{Prob}\{\mathrm{wt}(u) = 0\} = 2^{-k}, \]

\[ \mathrm{Prob}\{\mathrm{wt}(u) = j\} = 2^{-n} \left\{ \binom{n}{j} - \binom{n-k}{j} \right\}, \quad j > 0. \]

Thus we can say that (33) holds for an $[n, k]$ code chosen at random.

(9) Show that $\mu_r(b)$ given by Equation (34) satisfies

\[ \mu_r(b) \le \frac{r!}{2^{r/2} (r/2)!} \quad \text{for } r \text{ even} \ge 2. \]

[Hint. Define independent random variables $X_1, \ldots, X_n$, where each $X_i$ is $+1$ or $-1$ with probability $\frac{1}{2}$. $X_i$ is approximated by a normal random variable $Y_i$ with mean 0 and variance 1. If $r$ is even, $E X_i^r = 1 \le E Y_i^r$. Then

\[ \mu_r(b) = \frac{1}{n^{r/2}} E \left( \sum_{i=1}^{n} X_i \right)^r \le \frac{1}{n^{r/2}} E \left( \sum_{i=1}^{n} Y_i \right)^r. ] \]


§11. Non-BCH triple-error-correcting codes


In this final section we describe some triple-error-correcting non-BCH* binary codes. These are similar to BCH codes, and have the same parameters, but require a different and interesting technique to find $d$. They will be used in §7 of Ch. 15 to construct nonlinear Goethals codes.

The block length will be $n = 2^m - 1$, where $m = 2t + 1$, $t \ge 2$. $\alpha$ is a primitive element of GF($2^m$) and $M^{(i)}(x)$ is the minimal polynomial of $\alpha^i$.

Triple-error-correcting binary BCH codes normally have $g(x) = M^{(1)}(x) M^{(3)}(x) M^{(5)}(x)$. But the new codes have $g(x) = M^{(1)}(x) M^{(r)}(x) M^{(s)}(x)$, where $r = 1 + 2^{t-1}$, $s = 1 + 2^t$. By Problem 14 of Ch. 7, $M^{(r)}(x)$ and $M^{(s)}(x)$ have degree $m$. The first examples of the new codes are:

$m = 5$. A [31, 16, 7] code with $g(x) = M^{(1)}(x) M^{(3)}(x) M^{(5)}(x)$; this is a triple-error-correcting BCH code.

$m = 7$. A [127, 106, 7] code with $g(x) = M^{(1)}(x) M^{(5)}(x) M^{(9)}(x)$.

First we consider the $[n, n - 2m]$ code $\mathcal{B}$ with $g(x) = M^{(r)}(x) M^{(s)}(x)$.

$\mathcal{B}$ is a double-error-correcting code. To show this we use linearized polynomials.

Let $a(x) = x^{i_1} + x^{i_2} + \cdots + x^{i_w}$ be any binary vector of length $n$ and weight $w$, with locators $X = \{\alpha^{i_1}, \ldots, \alpha^{i_w}\}$. Note that

\[ a(\alpha^j) = \sum_{\beta \in X} \beta^j. \qquad (41) \]


*But see Research Problem 9.7. 
The elements of $X$ may be represented as binary $m$-tuples. Let $V_a$ be the subspace of GF($2^m$) generated by these $m$-tuples.

Let the rank of $a$, denoted by $r_a$, be the dimension of this subspace. Naturally $r_a \le w$ with equality iff the elements of $X$ are linearly independent over GF(2).

Each vector of $V_a$ represents an element $\beta$ of GF($2^m$). We set $L(z) = \prod_{\beta \in V_a} (z - \beta)$. Since $X \subset V_a$,

\[ L(\beta) = 0 \quad \text{for } \beta \in X. \qquad (42) \]

It follows from §9 of Ch. 4 that $L(z)$ is a linearized polynomial, i.e.

\[ L(z) = \sum_{i=0}^{r_a} l_i z^{2^i}, \quad l_i \in \mathrm{GF}(2^m). \qquad (43) \]

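Lemma 25. If $a(x)$ is a codeword of $\mathcal{B}$ with rank $r_a \le 4$, then $a(\alpha^{1+2^i}) = 0$ for all $i \ge 0$.

Proof. For this $a(x)$ we have

\[ L(z) = l_0 z + l_1 z^2 + l_2 z^4 + l_3 z^8 + l_4 z^{16}, \]

where $l_0 \ne 0$. From (42), $L(\beta) = 0$ for all $\beta \in X$ and so

\[ 0 = \sum_{\beta \in X} \beta L(\beta)^{2^{t-2}} = \sum_{\beta \in X} \beta \sum_{i=0}^{4} l_i^{2^{t-2}} \beta^{2^{i+t-2}} = \sum_{i=0}^{4} l_i^{2^{t-2}} a(\alpha^{1+2^{i+t-2}}) \quad \text{by (41).} \]

Rearranging:

\[ l_0^{2^{t-2}} a(\alpha^{1+2^{t-2}}) = \sum_{i=1}^{4} l_i^{2^{t-2}} a(\alpha^{1+2^{i+t-2}}). \]

Now by definition of $\mathcal{B}$,

\[ a(\alpha^{1+2^{t-1}}) = a(\alpha^{1+2^t}) = 0, \]
\[ a(\alpha^{1+2^{t+1}}) = a(\alpha^{1+2^t})^{2^{t+1}} = 0, \]
\[ a(\alpha^{1+2^{t+2}}) = a(\alpha^{1+2^{t-1}})^{2^{t+2}} = 0, \]

hence $a(\alpha^{1+2^{t-2}}) = 0$. Furthermore,

\[ \sum_{\beta \in X} \beta L(\beta)^{2^i} = 0 \quad \text{for } 0 \le i \le t - 2, \]

so we can successively show that

\[ a(\alpha^{1+2^{t-3}}) = a(\alpha^{1+2^{t-4}}) = \cdots = a(\alpha^{1+2^0}) = 0. \qquad \text{Q.E.D.} \]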
Corollary 26. If $a(x)$ is a codeword of $\mathcal{B}$ with rank $\le 4$, then $\mathrm{wt}(a(x)) \ge 7$.


Proof. From the Lemma, $a(\alpha) = a(\alpha^3) = a(\alpha^5) = 0$. Then by the BCH bound, $\mathrm{wt}(a(x)) \ge 7$. Q.E.D.


Theorem 27. $\mathcal{B}$ has minimum distance at least 5.


Proof. For $a \in \mathcal{B}$, $\mathrm{wt}(a) \le 4 \Rightarrow r_a \le 4 \Rightarrow \mathrm{wt}(a) \ge 7$ (by Corollary 26), which is a contradiction. Q.E.D.
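For the smallest case $m = 5$ (so $t = 2$, $r = 3$, $s = 5$) Theorem 27 can be confirmed by direct computation. The sketch below is an illustration under assumptions not made in the text: GF(32) is built from the primitive polynomial $x^5 + x^2 + 1$, and the parity check matrix is taken to have columns $(\alpha^{ri}, \alpha^{si})$. A binary code has minimum distance at least 5 exactly when no four or fewer columns of a parity check matrix sum to zero.

```python
from itertools import combinations

M = 5                                   # GF(2^5), so n = 31 and t = 2
N = 2 ** M - 1
T = (M - 1) // 2
R, S = 1 + 2 ** (T - 1), 1 + 2 ** T     # r = 3, s = 5

# Powers of a primitive element alpha, reducing by x^5 + x^2 + 1 (assumed choice).
exp = [1]
for _ in range(N - 1):
    v = exp[-1] << 1
    if v & (1 << M):
        v ^= 0b100101
    exp.append(v)

# Column i of a parity check matrix: (alpha^(r i), alpha^(s i)) packed into 2m bits.
cols = [(exp[(R * i) % N] << M) | exp[(S * i) % N] for i in range(N)]

def min_distance_at_least_5(columns):
    """True iff no 1, 2, 3 or 4 of the columns sum to zero over GF(2)."""
    if 0 in columns or len(set(columns)) < len(columns):
        return False                               # weight 1 or 2 codeword
    pair_sums = [x ^ y for x, y in combinations(columns, 2)]
    if set(pair_sums) & set(columns):
        return False                               # weight 3 codeword
    return len(set(pair_sums)) == len(pair_sums)   # no weight 4 codeword

print(min_distance_at_least_5(cols))
```

For comparison, passing the Hamming-code columns `exp` alone to the same function returns `False`, since that code has minimum distance 3.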


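Problems. (10) Show that for $a(x) \in \mathcal{B}$, $\mathrm{wt}(a(x)) = 5 \Rightarrow r_a = 5$, i.e.

\[ a(x) = x^{i_1} + \cdots + x^{i_5}, \]

where $\alpha^{i_1}, \ldots, \alpha^{i_5}$ are linearly independent over GF(2).

(11) Let $a(x)$ be an even weight vector of the Hamming code generated by $M^{(1)}(x)$. Suppose $X = \{\beta_1, \beta_2, \ldots, \beta_{2e}\}$, $\tilde{X} = \{\beta_1 + \beta_{2e}, \beta_2 + \beta_{2e}, \ldots, \beta_{2e-1} + \beta_{2e}\}$, and let $\tilde{a}(x)$ be the vector defined by $\tilde{X}$. Show that $\tilde{a}(x)$ is also in the Hamming code.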


Finally we establish the minimum distance of our triple-error-correcting 
code. 


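Theorem 28. Let $\mathcal{C}$ be the $[n, n - 3m]$ code with generator polynomial $M^{(1)}(x) M^{(r)}(x) M^{(s)}(x)$. Then $\mathcal{C}$ has minimum distance 7.


Proof. Let $\mathcal{H}$ be the Hamming code generated by $M^{(1)}(x)$. Then $\mathcal{C} = \mathcal{H} \cap \mathcal{B}$, and $\mathcal{C}$ has minimum distance $\ge 5$.

Suppose $a(x) \in \mathcal{C}$ has weight 5. By Problem 10, $a(\alpha) \ne 0$. So $a(x) \notin \mathcal{H}$, a contradiction.

Now suppose $\mathrm{wt}(a(x)) = 6$, and $X = \{\beta_1, \ldots, \beta_6\}$. Since $a(x) \in \mathcal{H}$, $\sum_{i=1}^{6} \beta_i = 0$. Let $\tilde{X} = \{\beta_1 + \beta_6, \beta_2 + \beta_6, \ldots, \beta_5 + \beta_6\}$, and let $\tilde{a}(x)$ be the vector defined by $\tilde{X}$. We show that $\tilde{a}(x) \in \mathcal{C}$, a contradiction since $\mathrm{wt}(\tilde{a}(x)) = 5$. By Problem 11, $\tilde{a}(x) \in \mathcal{H}$. Also

\[ \tilde{a}(\alpha^{1+2^t}) = (\beta_1 + \beta_6)(\beta_1 + \beta_6)^{2^t} + \cdots + (\beta_5 + \beta_6)(\beta_5 + \beta_6)^{2^t} \]
\[ = (\beta_1 + \beta_6)(\beta_1^{2^t} + \beta_6^{2^t}) + \cdots + (\beta_5 + \beta_6)(\beta_5^{2^t} + \beta_6^{2^t}) \]
\[ = \beta_1^{1+2^t} + \cdots + \beta_6^{1+2^t}, \quad \text{using } \sum \beta_i = 0, \]
\[ = a(\alpha^{1+2^t}) = 0. \]

Similarly $\tilde{a}(\alpha^{1+2^{t-1}}) = a(\alpha^{1+2^{t-1}}) = 0$. Q.E.D.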
Research Problem (9.7). Does $\mathcal{C}$ have the same weight distribution as the triple-error-correcting BCH code? If so, are the codes equivalent? (We conjecture they are not.)


Notes on Chapter 9 


§1. The original references to Bose, Ray-Chaudhuri and Hocquenghem were
given in Ch. 7. The weight distributions of various BCH codes will be found 
in Berlekamp [113, Table 16.1], Goldman et al. [518], Kasami [727] and 
Myrvaagnes [981]. Chien and Frazer [289] apply BCH codes to document 
retrieval; see also Bose et al. [179]. 


§§2 and 4. The true minimum distance of BCH codes. A number of results are known besides the ones we give here. For example Kasami and Lin [734] have shown that $d = \delta$ if $n = 2^m - 1$ and $\delta = 2^{m-s-1} - 2^{m-s-i-1} - 1$ for $1 \le i \le m - s - 2$ and $0 \le s \le m - 2i$.

Also Peterson [1039] shows that if the BCH code of length $2^m - 1$ and
designed distance ô has d = ô, then the code with designed distance ô’ = 
(8 + 1)2"^ — 1, where h = ô, has minimum distance 6’. 

In order to determine the true minimum distance of the codes in Fig. 9.1 we 
used the following sources: Berlekamp [113, Table 16; 116], Chen [266], 
Kasami, Lin and Peterson [737], Peterson [1038, 1039], and Theorems 2, 3 and 
5. Other results on the exact determination of d will be found in Chen and Lin 
[274], Hartmann et al. [612], Knee and Goldman [770], Lin and Weldon [839], 
and Wolf [1428]. Theorem 2 is due to Farr [417] and Theorems 3 and 4 to 
Peterson [1038, 1039]: See also [932]. 

We would appreciate hearing from anyone who can remove any of the 
asterisks from Fig. 9.1. The entry marked with # was found by Kasami and 
Tokura [744], who give an infinite family of primitive BCH codes with $d > \delta$. Cerveira [255] and Lum and Chien [864] give nonprimitive BCH codes with $d > \delta$.

Stenbit [1276] gives generator polynomials for all the codes in Fig. 9.1. 


§3. This section follows Mann [906]. (There is an unfortunate misprint in the abstract of [906]. In line 3, for $i = 1, \ldots, v$ read $i = 1, \ldots, v - 1$.) Berlekamp [112; 113, Ch. 12] gives a general solution to the problem of finding $I(n, \delta)$.
See also Peterson [1036]. 


§4. For Problem 4 see Coxeter and Moser [314, p. 6].


§5. Theorem 13 was discovered by Lin and Weldon [838] and Camion [237].
Berlekamp [122] has proved: 
Theorem 29. For any sequence of primitive binary BCH codes of rate $R$,

\[ d \sim 2n \frac{\log_e R^{-1}}{\log_2 n}, \quad \text{as length } n \to \infty, \]

where $d$ may be interpreted as the designed distance, Bose distance, or true minimum distance.


Unfortunately the proof is too long to give here. 


Kasami [728] has given the following generalization of Theorem 13. Any 
family of cyclic codes is bad if it has the property that the extended codes are 
invariant under the affine group. By Theorem 16 of Ch. 8 this includes BCH 
codes. On the other hand McEliece [938] has shown that there exist (possibly 
nonlinear) codes which are invariant under large groups and meet the Gilbert- 
Varshamov bound. See also Theorem 31 of Ch. 17. 


§6. Decoding. The main source is Berlekamp [113]. Excellent descriptions are
also given by Peterson and Weldon [1040], and by Gallager [464]. Chien [282] 
contains a good survey. 

Special decoding algorithms for certain BCH codes have been given by 
Banerji [67], Bartee and Schneider [71], Blokh [164], Cowles and Davida 
[312], Davida [333], Kasami [725], Matveeva [929,930], and Polkinghorn 
[1065]. 

Theorem 14 is from Peterson [1036]. For the Berlekamp algorithm see the 
original reference [113, Ch. 7], Massey [922, 922a], Gallager [464], Peterson 
and Weldon [1040, Ch. 9], and also Ch. 12. Refinements will be found in 
Burton [216], Ong [1014], Sullivan [1293], and Tzeng et al. [1352]. Bajoga and 
Walbesser [59] study the complexity of this algorithm. 

For the Chien search see Chien [279]. Gore [543,547] and Mandelbaum 
[896, 899] describe techniques for speeding up or avoiding the Chien search. 
The Chien search can be avoided if the zeros of $\sigma(z)$ can be found directly, so
the following references on factoring polynomials over finite fields are rele- 
vant: Berlekamp [111, 113, 117, 121], Berlekamp et al. [131], Chien et al. [288], 
Golomb [523], McEliece [937] and Mills and Zierler [961]. Other references on 
decoding BCH codes are: Berlekamp [110], Chien and Tzeng [293], Davida 
[334], Forney [435], Laws [795], Massey [920], Michelson [954], Nesenbergs 
[985], Szwaja [1298], Tanaka et al. [1301, 1304], Ullman [1357], and Wolf 
[1426]. 


Correcting more than t errors. The complete decoding of double-error-correcting BCH codes is given by Berlekamp [113, §16.481] and Hartmann [607].
Complete decoding of some (perhaps all) triple-error-correcting BCH codes is 
given by Van der Horst and Berger [663]. Their algorithm applies to all 
triple-error-correcting BCH codes if the following problem is settled. 
Research Problem (9.8). Show that the maximum weight of a coset leader of any coset of a triple-error-correcting BCH code is 5. (This is known to be true except possibly for $m \ge 12$ and $m \equiv 2 \pmod 4$.)


Berger and Van der Horst [106] use their decoding algorithm to apply 
triple-error-correcting codes to source coding. 

Hartmann [609], Hartmann and Tzeng [619], Reddy [1096], and Tzeng and 
Hartmann [1351] give decoding algorithms for correcting a little beyond the 
BCH bound in certain cases. 


§8. The results of this section are from Gorenstein, Peterson, and Zierler [550]. See also Berlekamp [113, §16.481] and [663]. Leont'ev [816] has partially solved Research Problem 9.4 by showing that a binary BCH code of length $n = 2^m - 1$ and designed distance $\delta = 2t + 1$ is not quasi-perfect if $2 < t < n^{1/2} / \log n$ and $m \ge 7$.


§9. Theorem 19 is from Carlitz and Uchiyama [249], and depends on a deep
theorem of Weil [1394]. See also Williams [1419]. Anderson [28] was the first 
to use Theorem 19 in coding theory; however, our result is slightly stronger 
than his. 


Research Problem (9.9) Can any other bound from number theory, for 
example those of Lang and Weil [793] or Deligne [342], be used to obtain 
bounds on the minimum distance of codes? 

Theorem 21 is from Sidel'nikov [1208]. 


§10. Theorem 23 is also due to Sidel'nikov [1208] (except that he has 9 where we have 20). It has been generalized by Delsarte [360]. The proof is modeled on Feller's proof of the Berry-Esséen central limit theorem given in [427, Ch. 16, §5]. Sidel'nikov has also proved that for many BCH codes


Aj = E () (14€), where |e] « Cn ^"^, 


but the proof is too complicated to give here. 
Combining Theorems 21, 23 and 29 we deduce that the weight distributions 
of primitive binary BCH codes of fixed rate are asymptotically normal. 


Reed-Solomon and Justesen 
codes 


§1. Introduction 


The first part of the chapter deals with Reed-Solomon codes, which are BCH codes over GF($q$) with the special property that the length $n$ is $q - 1$. Besides serving as illuminating examples of BCH codes, they are of considerable practical and theoretical importance, as we shall see. They are convenient for building up other codes, either alone (for example by mapping into binary codes, §5) or in combination with other codes, as in concatenated codes (§11).

Justesen codes (§11) are a family of concatenated codes which can be obtained in this way. Justesen codes are distinguished by being the first family of codes we have seen with the property that, as the length increases, both the rate and distance/length remain positive. Thus, unlike BCH codes (see §5 of Ch. 9), asymptotically Justesen codes are good.


§2. Reed-Solomon codes


Definition. A Reed-Solomon (or RS) code over GF($q$) is a BCH code of length $N = q - 1$. Of course $q$ is never 2. Thus the length is the number of nonzero elements in the ground field. We shall use $N$, $K$ and $D$ to denote the length, dimension, and minimum distance (using capital letters to distinguish them from the parameters of the binary codes which will be constructed later). Figure 10.6 gives a summary of the properties of these codes.

Since $x^{q-1} - 1 = \prod_{\beta \in \mathrm{GF}(q)^*} (x - \beta)$, the minimal polynomial of $\alpha^i$ is simply $M^{(i)}(x) = x - \alpha^i$. Therefore an RS code of length $q - 1$ and designed distance $\delta$ has generator polynomial

\[ g(x) = (x - \alpha^b)(x - \alpha^{b+1}) \cdots (x - \alpha^{b+\delta-2}). \qquad (1) \]

Usually, but not always, $b = 1$.

Examples. (1) As usual take GF(4) = $\{0, 1, \alpha, \beta = \alpha^2\}$ with $\alpha^2 + \alpha + 1 = 0$. An RS code over GF(4) with $N = 3$, designed distance 2 and $b = 2$ has $g(x) = x - \beta$. The $4^2$ codewords are shown in Fig. 10.1.


000   1α0   β0α   βα1
01α   αβ0   10β   111
0αβ   β10   1βα   ααα
0β1   α01   α1β   βββ

Fig. 10.1. A [3, 2, 2] RS code over GF(4).


(2) The RS code over GF(5) with $N = 4$ and designed distance 3. We take $\alpha = 2$ as the primitive element of GF(5), so that

\[ g(x) = (x - \alpha)(x - \alpha^2) = (x - 2)(x - 4) = x^2 + 4x + 3. \]


Some of the 25 codewords are 3410, 2140, 1320, 0341, 1111,.... 
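Example (2) is small enough to enumerate. The following sketch, with illustrative helper names, lists the 25 codewords as multiples of $g(x)$ over GF(5), writing each codeword as the string $c_0 c_1 c_2 c_3$ of coefficients of $1, x, x^2, x^3$; that ordering is an assumption chosen to match the codeword 3410 for $g(x)$ itself.

```python
from itertools import product

P = 5                      # GF(5); alpha = 2 is primitive, as in the example
N, ALPHA = 4, 2
g = [3, 4, 1]              # g(x) = 3 + 4x + x^2, lowest degree first

def poly_mul(a, b, p=P):
    """Product of two polynomials over GF(p), coefficients lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

# (x - alpha)(x - alpha^2) should reproduce g(x).
assert poly_mul([-ALPHA % P, 1], [-pow(ALPHA, 2, P) % P, 1]) == g

# All codewords a(x) g(x) with deg a(x) < K = 2.
codewords = {tuple(poly_mul([a0, a1], g)) for a0, a1 in product(range(P), repeat=2)}
min_weight = min(sum(c != 0 for c in w) for w in codewords if any(w))

def show(w):
    return "".join(map(str, w))

print(sorted(show(w) for w in codewords)[:5], min_weight)
```

The minimum nonzero weight found this way equals the designed distance 3.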

The dimension of an RS code is $K = N - \deg g(x) = N - \delta + 1$. The minimum distance $D$ is, by the BCH bound (Theorem 8 of Ch. 7), at least $\delta = N - K + 1$. However, by Theorem 11 of Ch. 1 it can't be greater than this. Therefore

\[ D = N - K + 1, \]

and RS codes are maximum distance separable (see §10 of Ch. 1, and the next chapter). It follows that the Hamming weight distribution of any RS code is given by Theorem 6 of Ch. 11.

RS codes are important for several reasons: 

(i) They are the natural codes to use when a code is required of length less 
than the size of the field. For, being MDS, they have the highest possible 
minimum distance. 

(ii) They are convenient for building other codes, as we shall see. For 
example they can be mapped into binary codes with surprisingly high 
minimum distance (§5). They are also used in constructing concatenated and 
Justesen codes (§11). 

(iii) They are useful for correcting bursts of errors (§6).

Encoding and decoding are discussed in §7 and §10. 


Problem. (1) Show that the dual of an RS code is an RS code. 
§3. Extended RS codes


Adding an overall parity check to a code does not always increase the minimum distance, as Problem 42 of Ch. 1 showed. However:


Theorem 1. Let $\mathcal{C}$ be the $[N = q - 1, K, D]$ RS code with generator polynomial

\[ g(x) = (x - \alpha)(x - \alpha^2) \cdots (x - \alpha^{D-1}). \qquad (2) \]

Then extending each codeword $c = c_0 c_1 \cdots c_{N-1}$ of $\mathcal{C}$ by adding an overall parity check

\[ c_N = - \sum_{i=0}^{N-1} c_i \]

produces an $[N + 1, K, D + 1]$ code.


Proof. Suppose $c$ has weight $D$. The minimum weight is increased to $D + 1$ provided

\[ c(1) = -c_N = \sum_{i=0}^{N-1} c_i \ne 0. \]

But $c(x) = a(x) g(x)$ for some $a(x)$, so $c(1) = a(1) g(1)$. Certainly $g(1) \ne 0$. Furthermore $a(1) \ne 0$, or else $c(x)$ is a multiple of $(x - 1) g(x)$ and has weight $\ge D + 1$ by the BCH bound. Q.E.D.


Example. The preceding example gives the [4, 2, 3] code shown in Fig. 10.2. In fact a further extension is always possible; see §5 of Ch. 11.


0000   1α0β   β0α1   βα10
01αβ   αβ01   10βα   1111
0αβ1   β10α   1βα0   αααα
0β1α   α01β   α1β0   ββββ

Fig. 10.2. A [4, 2, 3] extended RS code.
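Theorem 1 can also be checked on a code over a prime field, where arithmetic is just integer arithmetic modulo $p$. The sketch below is a hypothetical illustration that uses the [4, 2, 3] RS code over GF(5) from Example (2) of §2 rather than the GF(4) code of Fig. 10.2; the function names are not from the text.

```python
from itertools import product

P, g = 5, [3, 4, 1]        # g(x) = 3 + 4x + x^2 over GF(5), zeros alpha and alpha^2

def encode(a0, a1):
    """Coefficients of (a0 + a1 x) g(x) over GF(5), lowest degree first."""
    c = [0, 0, 0, 0]
    for i, ai in enumerate((a0, a1)):
        for j, gj in enumerate(g):
            c[i + j] = (c[i + j] + ai * gj) % P
    return c

def extend(c):
    """Append the overall parity check c_N = -(c_0 + ... + c_{N-1})."""
    return c + [(-sum(c)) % P]

def weight(c):
    return sum(x != 0 for x in c)

code = [encode(a0, a1) for a0, a1 in product(range(P), repeat=2)]
d = min(weight(c) for c in code if any(c))
d_ext = min(weight(extend(c)) for c in code if any(c))
print(d, d_ext)
```

The extended words all have coordinate sum zero, and the minimum distance rises from 3 to 4 as the theorem predicts.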


§4. Idempotents of RS codes


In this section we assume $q = 2^m$. Minimal RS codes have dimension 1 (hence $D = N$) and are easy to describe. The check polynomial is $h(x) = x + \alpha^i$, for some $i = 0, 1, \ldots, N - 1$, and $g(x) = (x^N + 1)/h(x)$. There are $q - 1$ nonzero codewords, all of weight $N$, and they consist of all scalar multiples of one codeword.

Suppose

\[ E(x) = \sum_{i=0}^{N-1} \beta_i x^i \]

is the corresponding primitive idempotent, with $\beta_0 = 1$. Since the code is cyclic,

\[ \beta_1 x E(x) = \beta_1 \beta_{N-1} + \beta_1 x + \cdots + \beta_1 \beta_{N-2} x^{N-1} \]

is also in the code, and must be equal to $E(x)$ (or else $E(x) - \beta_1 x E(x)$ would have weight less than $N$). Therefore $\beta_2 = \beta_1^2$. Repeating this argument shows that

\[ E(x) = 1 + \beta x + (\beta x)^2 + (\beta x)^3 + \cdots + (\beta x)^{N-1} \qquad (3) \]

for some $\beta$.
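For example, when $N = 7$, the primitive idempotents are shown in Fig. 10.3 (taking $\beta = 1, \alpha, \ldots, \alpha^6$), together with the corresponding check polynomials.

E(x)                                                                                                   h(x)
$\theta_0 = 1 + x + x^2 + x^3 + x^4 + x^5 + x^6$                                                       $x + 1$
$\theta_6 = 1 + \alpha x + \alpha^2 x^2 + \alpha^3 x^3 + \alpha^4 x^4 + \alpha^5 x^5 + \alpha^6 x^6$   $x + \alpha^6$
$\theta_5 = 1 + \alpha^2 x + \alpha^4 x^2 + \alpha^6 x^3 + \alpha x^4 + \alpha^3 x^5 + \alpha^5 x^6$   $x + \alpha^5$
$\theta_4 = 1 + \alpha^3 x + \alpha^6 x^2 + \alpha^2 x^3 + \alpha^5 x^4 + \alpha x^5 + \alpha^4 x^6$   $x + \alpha^4$
$\theta_3 = 1 + \alpha^4 x + \alpha x^2 + \alpha^5 x^3 + \alpha^2 x^4 + \alpha^6 x^5 + \alpha^3 x^6$   $x + \alpha^3$
$\theta_2 = 1 + \alpha^5 x + \alpha^3 x^2 + \alpha x^3 + \alpha^6 x^4 + \alpha^4 x^5 + \alpha^2 x^6$   $x + \alpha^2$
$\theta_1 = 1 + \alpha^6 x + \alpha^5 x^2 + \alpha^4 x^3 + \alpha^3 x^4 + \alpha^2 x^5 + \alpha x^6$   $x + \alpha$

Fig. 10.3. Idempotents of minimal [7, 1, 7] RS codes over GF(8).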


Then the idempotent of the $[2^m - 1, K, 2^m - K]$ RS code (with $b = 1$) is $\sum_{i=N-K+1}^{N} \theta_i$, where $\theta_N = \theta_0$. (This is somewhat easier to find than the generator polynomial Equation (1), although both require a table of GF($2^m$).)

Problem. (2) Show that the idempotent of the [7, 2, 6] RS code (with $b = 1$) is

\[ E(x) = \alpha^3 x + \alpha^6 x^2 + \alpha x^3 + \alpha^5 x^4 + \alpha^4 x^5 + \alpha^2 x^6. \]

Show that the codewords consist of cyclic permutations of the 9 codewords $\theta_0$, $\theta_6$, $\alpha^i E(x)$, $0 \le i \le 6$.
§5. Mapping GF($2^m$) codes into binary codes


We know from Ch. 4 that elements of GF($q$), where $q = p^m$, can be represented by $m$-tuples of elements from GF($p$). Therefore an $[N, K, D]$ RS code over GF($q$) becomes an $[n = mN, k = mK, d \ge D]$ code over GF($p$). If $q = 2^m$ the binary codes obtained in this way (and others derived from them) often have high minimum distance, as we now see.

Let $\xi_1, \ldots, \xi_m$ be a basis for GF($2^m$) over GF(2). Then if $\beta = \sum_{i=1}^{m} b_i \xi_i$ is any element of GF($2^m$), $b_i \in$ GF(2), we map $\beta$ into $b_1, b_2, \ldots, b_m$. This mapping sends linear codes into linear codes (but cyclic codes need not go into cyclic codes).


Examples. (1) Using the basis $1, \alpha$ for GF(4) over GF(2), 0 maps into 00, 1 into 10, $\alpha$ into 01, $\alpha^2$ into 11. Then the [3, 2, 2] RS code over GF(4) of Fig. 10.1 becomes the [6, 4, 2] binary code of Fig. 10.4.


000000 100100 110001 110110 
001001 011100 100011 101010 
000111 111000 101101 010101 
001110 010010 011011 111111. 

Fig. 10.4. A [6, 4, 2] binary code obtained from Fig. 10.1. 
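The binary code of Fig. 10.4 can be regenerated mechanically. The following sketch makes one representational assumption that is not in the text: elements of GF(4) are stored as the integers 0 to 3, with bit 0 the coefficient of 1 and bit 1 the coefficient of $\alpha$, so that addition is XOR.

```python
from itertools import product

# GF(4) = {0, 1, alpha, alpha^2} stored as 0..3 (alpha = 2, alpha^2 = alpha + 1 = 3).
EXP = [1, 2, 3]                       # alpha^0, alpha^1, alpha^2
LOG = {1: 0, 2: 1, 3: 2}

def gf4_mul(x, y):
    if x == 0 or y == 0:
        return 0
    return EXP[(LOG[x] + LOG[y]) % 3]

BETA = 3                              # beta = alpha^2
g = [BETA, 1]                         # g(x) = x - beta = x + beta in characteristic 2

def rs_codeword(a0, a1):
    """Coefficients c_0 c_1 c_2 of (a0 + a1 x) g(x) over GF(4)."""
    c = [0, 0, 0]
    for i, ai in enumerate((a0, a1)):
        for j, gj in enumerate(g):
            c[i + j] ^= gf4_mul(ai, gj)   # addition in GF(4) is XOR
    return c

def to_bits(sym):
    """Map b_1 + b_2 * alpha to the pair b_1 b_2 (basis 1, alpha)."""
    return f"{sym & 1}{sym >> 1}"

binary_code = {"".join(to_bits(s) for s in rs_codeword(a0, a1))
               for a0, a1 in product(range(4), repeat=2)}
min_weight = min(w.count("1") for w in binary_code if "1" in w)
print(len(binary_code), min_weight)
```

The resulting set of 16 binary words coincides with the entries of Fig. 10.4.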


(2) Let $c = (c_0, c_1, \ldots, c_{N-1})$ belong to an $[N, K, D]$ RS code over GF($2^m$). Replace each $c_i$ by the corresponding binary $m$-tuple, and add an overall parity check on each $m$-tuple. The resulting binary code has parameters

\[ n = (m + 1)(2^m - 1), \quad k = mK, \quad d \ge 2D = 2(2^m - K), \qquad (4) \]

for any $K = 1, \ldots, 2^m - 2$. The same construction applied to the extended RS code gives

\[ [(m + 1) 2^m, \; mK, \; d \ge 2(2^m - K + 1)] \qquad (5) \]

binary codes, for $K = 1, \ldots, 2^m - 1$.

E.g. From the [15, 10, 6] and [16, 10, 7] codes over GF($2^4$) we obtain [75, 40, 12] and [80, 40, 14] binary codes. Even though slightly better codes exist (we shall construct an [80, 40, 16] quadratic residue code in Ch. 16), nevertheless this construction is impressively simple. (See also §8.1 of Ch. 18.)
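The parameters in (4) and (5) are easy to tabulate. A minimal sketch follows; the function name `binary_from_rs` is an illustrative choice, and the third value returned is only the lower bound on $d$ given by the construction, not the true minimum distance.

```python
def binary_from_rs(m, K, extended=False):
    """Parameters (n, k, lower bound on d) of the binary code obtained from an
    RS code over GF(2^m) of dimension K by adding a parity bit to every m-tuple."""
    N = 2 ** m - 1
    D = N - K + 1
    if extended:                      # extended RS code: one more symbol, D + 1
        N, D = N + 1, D + 1
    return (m + 1) * N, m * K, 2 * D

print(binary_from_rs(4, 10))                  # from the [15, 10, 6] code
print(binary_from_rs(4, 10, extended=True))   # from the [16, 10, 7] code
```

Looping over `K` for a fixed `m` gives the whole family of parameters at once.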
(3) Using the basis $1, \alpha, \alpha^6$ for GF($2^3$) over GF(2), the mapping is

0 → 000,    α² → 101,    α⁵ → 011,
1 → 100,    α³ → 110,    α⁶ → 001.
α → 010,    α⁴ → 111,

Consider the [7, 5, 3] RS code over GF($2^3$) with generator polynomial

\[ g_1(x) = (x + \alpha^5)(x + \alpha^6) = \alpha^4 + \alpha x + x^2. \]

It is surprising that this is mapped onto the [21, 15, 3] binary BCH code with generator polynomial

\[ g_2(y) = M^{(1)}(y) = 1 + y + y^2 + y^4 + y^6. \]

For $g_1(x)$ itself is mapped onto the vector

111, 010, 100, 000, 000, 000, 000

which is $g_2(y)$. Also $\alpha g_1(x)$ is mapped onto $y g_2(y)$, $\alpha^2 g_1(x)$ onto $y^2 g_2(y)$, $x g_1(x)$ onto $y^3 g_2(y)$, and so on.

This is the only known, nontrivial, example of a cyclic code mapping in this way onto a cyclic code!
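The correspondence in this example can be verified symbol by symbol. The sketch below assumes that GF(8) is defined by $\alpha^3 = \alpha + 1$ and that field elements are expressed in the basis $1, \alpha, \alpha^6$ (so that $\alpha^4$ has image 111); both assumptions are inferred from the vector 111, 010, 100 quoted for $g_1(x)$. It checks that $\alpha^i x^j g_1(x)$ has binary image $y^{i+3j} g_2(y)$.

```python
# GF(8) with alpha^3 = alpha + 1; an element is a 3-bit integer whose bit i is
# the coefficient of alpha^i in the polynomial basis 1, alpha, alpha^2.
EXP = [1]
for _ in range(6):
    v = EXP[-1] << 1
    if v & 0b1000:
        v ^= 0b1011
    EXP.append(v)
LOG = {v: i for i, v in enumerate(EXP)}

def mul(x, y):
    return 0 if 0 in (x, y) else EXP[(LOG[x] + LOG[y]) % 7]

def bits(e):
    """Coordinates of e in the basis 1, alpha, alpha^6 (note alpha^2 = 1 + alpha^6)."""
    b0, b1, b2 = e & 1, (e >> 1) & 1, (e >> 2) & 1
    return [b0 ^ b2, b1, b2]

def image(word):
    """Binary image of a length-7 word over GF(8), as a list of 21 bits."""
    return [b for sym in word for b in bits(sym)]

g1 = [EXP[4], EXP[1], 1, 0, 0, 0, 0]           # alpha^4 + alpha x + x^2
g2 = [1, 1, 1, 0, 1, 0, 1] + [0] * 14          # 1 + y + y^2 + y^4 + y^6

def shift(v, s):
    """Cyclic shift to the right by s places (multiplication by x^s or y^s)."""
    s %= len(v)
    return v[-s:] + v[:-s] if s else list(v)

ok = all(
    image(shift([mul(EXP[i], c) for c in g1], j)) == shift(g2, i + 3 * j)
    for i in range(3) for j in range(7)
)
print(ok)
```

Since the 21 words checked are the images of a spanning set of the RS code, this shows that the binary image is spanned by the cyclic shifts of $g_2(y)$.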


Problem. (3) Let $\varphi$ be a linear mapping from GF($2^m$) onto GF(2)$^m$, and $\bar{\varphi}$ be the induced mapping from vectors of length $2^m - 1$ over GF($2^m$) onto binary vectors of length $m(2^m - 1)$. Suppose $\mathcal{C}_1$ is a cyclic code over GF($2^m$) with generator polynomial $g_1(x)$, and $\mathcal{C}_2$ is a binary cyclic code with generator polynomial $g_2(y)$. Show that $\bar{\varphi}$ maps $\mathcal{C}_1$ onto $\mathcal{C}_2$ iff $y^i g_2(y)$ is the image under $\bar{\varphi}$ of some scalar multiple of $g_1(x)$, for all $0 \le i \le m - 1$.


Research Problem (10.1). Find some more examples of cyclic codes mapping onto cyclic codes.


The effect of changing the basis. A change of basis may change the weight distribution and even the minimum weight. For example, consider the [7, 2, 6] MDS code with idempotent $\theta_1 + \theta_6$ and check polynomial $(x + \alpha)(x + \alpha^6)$. (This is not an RS code.) The codewords (over GF($2^3$)) are cyclic permutations of the 9 vectors in Fig. 10.5.

Using the basis $1, \alpha, \alpha^2$ we obtain the binary weights in column 8. Using the basis $\alpha^3, \alpha^6, \alpha^5$ we have the mapping

0 → 000,    α² → 101,    α⁵ → 001,
1 → 111,    α³ → 100,    α⁶ → 010,
α → 011,    α⁴ → 110,

giving the binary weights in column 9. Notice the minimum weight obtained from the first mapping is 8, but from the second only 6.

Codeword                                  Basis #1   Basis #2
1    x    x²   x³   x⁴   x⁵   x⁶          Weight     Weight
1    α    α²   α³   α⁴   α⁵   α⁶          12         12
1    α⁶   α⁵   α⁴   α³   α²   α           12         12
0    α⁵   α³   α⁶   α⁶   α³   α⁵          14         6
0    α⁶   α⁴   1    1    α⁴   α⁶          10         12
0    1    α⁵   α    α    α⁵   1           10         12
0    α    α⁶   α²   α²   α⁶   α           8          10
0    α²   1    α³   α³   1    α²          8          12
0    α³   α    α⁴   α⁴   α    α³          10         10
0    α⁴   α²   α⁵   α⁵   α²   α⁴          12         10


Fig. 10.5. A cyclic code and the weights of two binary codes obtained from it.


Problem. (4) Show that the weight distribution of the first binary code is: $A_0 = 1$, $A_8 = 14$, $A_{10} = A_{12} = 21$, $A_{14} = 7$. For the second: $A_0 = 1$, $A_6 = 7$, $A_{10} = 21$, $A_{12} = 35$.


Research Problem (10.2). Given a code $\mathcal{C}$ over GF($2^m$), which basis for GF($2^m$) over GF(2) maps $\mathcal{C}$ into the binary code $\mathcal{C}^*$ with the greatest minimum distance? How much effect does this have on the minimum distance of $\mathcal{C}^*$? Does it help to use nonlinear mappings?


RS codes contain BCH codes. The codes in Example 2 are so good that it is worth examining their performance for large $m$. This deteriorates because of


Theorem 2. The $[N = 2^m - 1, K, D]$ RS code with zeros $\alpha, \alpha^2, \ldots, \alpha^{D-1}$ contains the primitive binary BCH code of length $N$ and designed distance $D$. Similarly the extended RS code contains the extended BCH code.


Proof. If $c$ belongs to the BCH code, $c$ is a binary vector with $c(\alpha) = \cdots = c(\alpha^{D-1}) = 0$, and so also belongs to the RS code. Q.E.D.


Therefore the minimum distance of the RS code is at most the minimum distance of the BCH code. From this we can show that, just as long BCH codes are bad (§5 of Ch. 9), so are these long binary codes obtained from RS codes.


Theorem 3. The binary codes obtained from RS codes and having parameters given in (4) and (5) are asymptotically bad. That is, they do not contain an infinite family of codes with both rate and distance/length bounded away from zero.
Proof. Let $\mathcal{C}$ be an $[N = 2^m - 1, K, D = N - K + 1]$ RS code. From Theorem 2, $\mathcal{C}$ contains the binary BCH code $\mathcal{C}_1$ of length $N$ and designed distance $D$. Define $d_2$ by

\[ 2^{i-1} \le D \le d_2 = 2^i - 1, \]

so that $\mathcal{C}_1$ contains the BCH code $\mathcal{C}_2$ of designed distance $d_2$. By Theorem 5 of Ch. 9, $d_2$ is the true minimum distance of $\mathcal{C}_2$.

Now let $\mathcal{C}_3$ be the $[(m + 1)N, mK, d_3]$ binary code obtained from $\mathcal{C}$ as in (4). (The proof for (5) is similar.) Then

\[ d_3 \le 2 d_2 < 4D. \]

Therefore if the rate $mK / ((m + 1)N) \approx K/N$ of $\mathcal{C}_3$ is held fixed, the ratio

\[ \frac{\text{distance}}{\text{length}} = \frac{d_3}{(m + 1)N} < \frac{4D}{(m + 1)N} = \frac{4(N - K + 1)}{(m + 1)N} \]

approaches zero as $m \to \infty$. Q.E.D.


However, by using only a slightly more complicated construction, it is possible to get asymptotically good binary codes from RS codes; see §11.


§6. Burst error correction


On many channels the errors are not random but tend to occur in clusters, or bursts.


Definition. A burst of length $b$ is a vector whose only nonzeros are among $b$ successive components, the first and last of which are nonzero.

Binary codes obtained from RS codes are particularly suited to correcting several bursts. For a binary burst of length $b$ can affect at most $r$ adjacent symbols from GF($2^m$), where $r$ is given by

\[ (r - 2)m + 2 \le b \le (r - 1)m + 1. \]

So if $D$ is much greater than $r$, many bursts can be corrected.
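The number of symbols a burst can touch is easy to compute and to confirm by brute force. The following sketch uses the closed form $r = \lceil (b-1)/m \rceil + 1$, which is a restatement of the inequality for $r$ and not a formula quoted from the text; both function names are illustrative.

```python
from math import ceil

def symbols_hit(b, m):
    """Largest number of adjacent m-bit symbols that a binary burst of length b can touch."""
    return ceil((b - 1) / m) + 1

def symbols_hit_brute(b, m):
    """Same quantity, trying every alignment of the burst against symbol boundaries."""
    return max((start + b - 1) // m - start // m + 1 for start in range(m))

m = 4
for b in range(1, 4 * m):
    r = symbols_hit(b, m)
    assert r == symbols_hit_brute(b, m)
    assert (r - 2) * m + 2 <= b <= (r - 1) * m + 1

print([symbols_hit(b, m) for b in range(1, 10)])
```

The worst case occurs when the burst starts on the last bit of a symbol.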


§7. Encoding Reed-Solomon codes


Since RS codes are cyclic, they can be encoded by either of the methods described in §8 of Ch. 7. However, the following simple encoding method (which is in fact the original method of Reed and Solomon) has certain practical advantages.

Let $u = (u_0, u_1, \ldots, u_{K-1})$, $u_i \in$ GF($q$), be the message symbols to be encoded, and let

\[ u(z) = \sum_{i=0}^{K-1} u_i z^i. \]

Then the codeword corresponding to $u$ is taken to be the vector $c$ whose Mattson-Solomon polynomial is $N u(z)$, where $N = q - 1$. Thus

\[ c = (u(1), u(\alpha), \ldots, u(\alpha^{N-1})). \qquad (6) \]

[Or $(u(0), u(1), u(\alpha), \ldots, u(\alpha^{N-1}))$ for the extended code with an overall parity check added.] We show that $c$ is in the Reed-Solomon code by verifying that

\[ c(x) = \sum_{i=0}^{N-1} c_i x^i \]

has $\alpha, \alpha^2, \ldots, \alpha^{D-1}$ as zeros. In fact the MS polynomial of $c$, $N u(z)$, is equal to

\[ \sum_{j=0}^{N-1} A_{-j} z^j, \quad \text{where } A_{-j} = c(\alpha^{-j}), \]

by Equation (8) of Ch. 8. Therefore, equating coefficients, and using $N = -1$ in GF($q$), we have

\[ c(1) = -u_0, \quad c(\alpha^{-1}) = -u_1, \quad \ldots, \quad c(\alpha^{-(K-1)}) = -u_{K-1} \qquad (7) \]

and $c(\alpha^j) = 0$ for $1 \le j \le N - K = D - 1$. Thus $c$ is in the RS code, and this is an encoding method for this code. Notice however that this encoder is not systematic.

Definition. Suppose $\mathcal{C}$ is an $(n, M = q^k, d)$ linear or nonlinear code over GF($q$), with an encoder which maps a message $u_0, \ldots, u_{k-1}$ onto the codeword $c_0, \ldots, c_{n-1}$. The encoder is called systematic if there are coordinates $i_0, \ldots, i_{k-1}$ such that $u_0 = c_{i_0}, \ldots, u_{k-1} = c_{i_{k-1}}$, i.e. if the message is to be found unchanged in the codeword.

For example, the encoder

message   codeword
00        000
01        010
10        101
11        111

is systematic (with $i_0 = 0$, $i_1 = 1$), while the same code with the encoder

message   codeword
00        000
01        111
10        101
11        010

is not. An unsystematic code scrambles the message. Even after the error vector and codeword have been recovered, in an unsystematic code a further computation is necessary to recover the message. In the present example Equation (7) tells how to recover the message from the codeword.


Problem. (5) Show that any linear code has a systematic encoder. What about the nonlinear codes {000, 110, 011, 111} and {000, 100, 010, 001}?


For example, consider the [4, 2, 3] RS code over GF(5) described in Example 2 of §2. Encoder #1 of §8 of Ch. 7 would map the message $(u_0, u_1)$ onto the codeword $(3u_0 + 3u_1, 4u_0 + 2u_1, u_0, u_1)$, the check symbols being the negative of the remainder of $x^2 (u_0 + u_1 x)$ on division by $g(x) = x^2 + 4x + 3$. On the other hand the encoder just described maps $(u_0, u_1)$ onto the codeword

\[ c = (u_0 + u_1, \; u_0 + 2u_1, \; u_0 + 4u_1, \; u_0 + 3u_1). \]


Problem. (6) Verify the last two statements.
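The evaluation encoder of Equation (6) and the message recovery rule of Equation (7) can be exercised on this small code. The sketch below works in GF(5) with $\alpha = 2$; the function names `rs_encode` and `recover` are illustrative choices, and polynomials are stored lowest degree first.

```python
from itertools import product

P, N, K, ALPHA = 5, 4, 2, 2       # GF(5), alpha = 2, the [4, 2, 3] code of Example 2

def evaluate(poly, x):
    """Horner evaluation of poly (lowest degree first) at x, modulo P."""
    acc = 0
    for coeff in reversed(poly):
        acc = (acc * x + coeff) % P
    return acc

def rs_encode(u):
    """Equation (6): c = (u(1), u(alpha), ..., u(alpha^(N-1)))."""
    return [evaluate(u, pow(ALPHA, i, P)) for i in range(N)]

def recover(c):
    """Equation (7): u_j = -c(alpha^(-j)) for j = 0, ..., K-1."""
    return [(-evaluate(c, pow(ALPHA, (N - j) % N, P))) % P for j in range(K)]

for u0, u1 in product(range(P), repeat=2):
    c = rs_encode([u0, u1])
    assert evaluate(c, ALPHA) == 0 and evaluate(c, pow(ALPHA, 2, P)) == 0
    assert recover(c) == [u0, u1]

print(rs_encode([1, 0]), rs_encode([0, 1]))
```

The two printed codewords are the rows of a generator matrix for this non-systematic encoder.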





An RS code of length $N = q - 1$ over GF($q$) is a cyclic code with generator polynomial $g(x) = (x - \alpha^b)(x - \alpha^{b+1}) \cdots (x - \alpha^{b+\delta-2})$, where $\alpha$ is a primitive element of GF($q$). The dimension is $K = N - \delta + 1$ and the minimum distance is $\delta$. (Often $b = 1$.) This is a BCH code, and is MDS (Ch. 11). It may be extended to $[q + 1, K, q - K + 2]$ and (if $q = 2^m$) $[2^m + 2, 3, 2^m]$ and $[2^m + 2, 2^m - 1, 4]$ codes (§5 of Ch. 11). The idempotent is given in §4. RS codes are important in concatenated codes (§11) and burst correction (§6).


Fig. 10.6. Properties of Reed-Solomon codes.


*$8. Generalized Reed-Solomon codes 


A slightly more general class of codes than RS codes is obtained if
Equation (6) is replaced by

\[ c = (v_1 u(1),\ v_2 u(\alpha),\ \ldots,\ v_N u(\alpha^{N-1})), \tag{8} \]

where the \(v_i\) are nonzero elements of GF(q). Equation (6) is the case when all
\(v_i = 1\). This suggests a further generalization.


Definition. Let \(\alpha = (\alpha_1, \ldots, \alpha_N)\) where the \(\alpha_i\) are distinct elements of GF(\(q^m\)),
and let \(v = (v_1, \ldots, v_N)\) where the \(v_i\) are nonzero (but not necessarily distinct)
elements of GF(\(q^m\)). Then the generalized RS code, denoted by \(\mathrm{GRS}_K(\alpha, v)\),
consists of all vectors

\[ (v_1 F(\alpha_1),\ v_2 F(\alpha_2),\ \ldots,\ v_N F(\alpha_N)) \tag{9} \]

where F(z) ranges over all polynomials of degree < K with coefficients from GF(\(q^m\)).
This is an [N, K] code over GF(\(q^m\)). Since F has at most K − 1 zeros, the
minimum distance is at least N − K + 1, and hence (by the bound d ≤ n − k + 1 of
Theorem 11 of Ch. 1) is equal to N − K + 1. The code is MDS.
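The definition is easy to test on a small case. The sketch below is only an illustration under stated assumptions: it works over the prime field GF(7) (integers mod 7) rather than a general GF(\(q^m\)), and the helper names `grs_encode` and `grs_min_distance` are ours.

```python
from itertools import product

P = 7  # assumption: a prime field, so arithmetic is just integers mod 7

def grs_encode(F, alpha, v):
    """Codeword (v_1 F(a_1), ..., v_N F(a_N)); F is a coefficient list."""
    def evaluate(x):
        return sum(c * pow(x, j, P) for j, c in enumerate(F)) % P
    return tuple(vi * evaluate(ai) % P for ai, vi in zip(alpha, v))

def grs_min_distance(alpha, v, K):
    """Brute-force minimum weight over all nonzero F of degree < K."""
    return min(sum(s != 0 for s in grs_encode(F, alpha, v))
               for F in product(range(P), repeat=K) if any(F))
```

With \(\alpha = (0, 1, 2, 3, 4, 5)\) and \(v = (1, 2, 3, 4, 5, 6)\) the search returns N − K + 1 for each K, as the argument above predicts.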


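Theorem 4. The dual of \(\mathrm{GRS}_K(\alpha, v)\) is \(\mathrm{GRS}_{N-K}(\alpha, v')\) for some \(v'\).


Proof. First suppose K = N − 1, and let \(\mathcal{D}\) be the dual code to \(\mathrm{GRS}_{N-1}(\alpha, v)\).
Then \(\mathcal{D}\) has dimension 1 and consists of all scalar multiples of some fixed
vector \(v' = (v'_1, \ldots, v'_N)\). We must show that all \(v'_i \neq 0\). \(v'\) satisfies

\[
\begin{aligned}
v_1 v'_1 + \cdots + v_N v'_N &= 0, \\
\alpha_1 v_1 v'_1 + \cdots + \alpha_N v_N v'_N &= 0, \\
&\ \ \vdots \\
\alpha_1^{N-2} v_1 v'_1 + \cdots + \alpha_N^{N-2} v_N v'_N &= 0,
\end{aligned}
\]

or

\[
\begin{pmatrix}
1 & \cdots & 1 \\
\alpha_1 & \cdots & \alpha_N \\
\vdots & & \vdots \\
\alpha_1^{N-2} & \cdots & \alpha_N^{N-2}
\end{pmatrix}
\begin{pmatrix} v_1 v'_1 \\ \vdots \\ v_N v'_N \end{pmatrix} = 0. \tag{10}
\]

If any \(v'_i = 0\) then (10) gives a set of simultaneous equations for the other \(v_j v'_j\)
whose coefficient matrix is Vandermonde. Hence from Lemma 17 of Ch. 4 all
\(v_j v'_j = 0\) and so all \(v'_j = 0\), which is impossible.

Then \(\mathrm{GRS}_K(\alpha, v)\) is dual to \(\mathrm{GRS}_{N-K}(\alpha, v')\), for all K ≤ N − 1, since

\[ \sum_{i=1}^{N} (\alpha_i^{s} v_i)(\alpha_i^{t} v'_i) = \sum_{i=1}^{N} \alpha_i^{s+t} v_i v'_i = 0 \]

for s ≤ K − 1, t ≤ N − K − 1, from (10). Q.E.D.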
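Theorem 4 only asserts that a suitable \(v'\) exists. A standard explicit choice, which is not stated in the text and is assumed here, is \(v'_i = \bigl(v_i \prod_{j \neq i} (\alpha_i - \alpha_j)\bigr)^{-1}\). The following sketch checks numerically that this choice makes the two codes orthogonal; it again assumes the prime field GF(7), and all function names are illustrative.

```python
P = 7  # assumption: prime field GF(7)

def inv(x):
    return pow(x, P - 2, P)  # Fermat inverse, valid because P is prime

def dual_multipliers(alpha, v):
    """Assumed formula: v'_i = 1 / (v_i * prod_{j != i} (a_i - a_j))."""
    out = []
    for i, (ai, vi) in enumerate(zip(alpha, v)):
        prod = 1
        for j, aj in enumerate(alpha):
            if j != i:
                prod = prod * (ai - aj) % P
        out.append(inv(vi * prod % P))
    return out

def generator_matrix(alpha, v, K):
    """Rows (v_1 a_1^s, ..., v_N a_N^s) for s = 0, ..., K-1."""
    return [[vi * pow(ai, s, P) % P for ai, vi in zip(alpha, v)]
            for s in range(K)]

def is_dual_pair(alpha, v, K):
    """True if every row of G is orthogonal to every row of H."""
    G = generator_matrix(alpha, v, K)
    H = generator_matrix(alpha, dual_multipliers(alpha, v), len(alpha) - K)
    return all(sum(g * h for g, h in zip(gr, hr)) % P == 0
               for gr in G for hr in H)
```

Since the N − K rows of `H` are independent, orthogonality is enough to conclude that `H` generates the whole dual code.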


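It follows from Theorem 4 that \(\mathrm{GRS}_K(\alpha, v)\) has parity check matrix equal
to a generator matrix for \(\mathrm{GRS}_{N-K}(\alpha, v')\), which is

\[
H = \begin{pmatrix}
v'_1 & \cdots & v'_N \\
\alpha_1 v'_1 & \cdots & \alpha_N v'_N \\
\vdots & & \vdots \\
\alpha_1^{N-K-1} v'_1 & \cdots & \alpha_N^{N-K-1} v'_N
\end{pmatrix}
= \begin{pmatrix}
1 & \cdots & 1 \\
\alpha_1 & \cdots & \alpha_N \\
\vdots & & \vdots \\
\alpha_1^{N-K-1} & \cdots & \alpha_N^{N-K-1}
\end{pmatrix}
\begin{pmatrix}
v'_1 & & 0 \\
 & \ddots & \\
0 & & v'_N
\end{pmatrix}. \tag{11}
\]

We shall meet these codes again in §2 of Ch. 12.

Problem. (7) (a) If \(\beta_i = c\alpha_i + d\) (i = 1, ..., N), for c ≠ 0, d ∈ GF(\(q^m\)), show that
\(\mathrm{GRS}_K(\alpha, v) = \mathrm{GRS}_K(\beta, v)\).

(b) When are \(\mathrm{GRS}_K(\alpha, v)\) and \(\mathrm{GRS}_K(\alpha, w)\) equivalent?

(c) Show that \(\mathrm{GRS}_K(\alpha, v) = \mathrm{GRS}_K(\alpha, w)\) iff \(v = \lambda w\), \(\lambda \neq 0\), \(\lambda \in\) GF(\(q^m\)).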


§9. Redundant residue codes


Equation (6) also suggests another way of looking at RS codes. Observe
that \(u(\alpha^i)\) is the remainder when the message u(x), of degree less than K, is
divided by \(M^{(i)}(x) = x - \alpha^i\). So we can restate Equation (6) by saying that u(x)
is encoded into

\[ (r_0, r_1, \ldots, r_{N-1}), \tag{12} \]

where \(r_i\) is the residue of u(x) modulo \(M^{(i)}(x)\).
u(x) can be recovered from its residues with the aid of: 


Theorem 5. (Chinese remainder theorem for polynomials.) Let
\(m_0(x), \ldots, m_{K-1}(x)\) be polynomials over GF(q) which are pairwise relatively
prime, and set \(M(x) = m_0(x) m_1(x) \cdots m_{K-1}(x)\). If \(r_0(x), \ldots, r_{K-1}(x)\) are any
polynomials over GF(q), there exists exactly one polynomial u(x) with
deg u(x) < deg M(x) such that

\[ u(x) \equiv r_i(x) \pmod{m_i(x)}, \tag{13} \]

for all i = 0, ..., K − 1. In fact, let \(a_i(x)\) be such that

\[ \frac{M(x)}{m_i(x)}\, a_i(x) \equiv 1 \pmod{m_i(x)}, \qquad i = 0, \ldots, K-1. \]

(Such an \(a_i(x)\) exists by Corollary 15 of Ch. 12.) Then the solution to (13) is

\[ u(x) = \sum_{i=0}^{K-1} \frac{M(x)}{m_i(x)}\, a_i(x)\, r_i(x) \quad \text{reduced mod } M(x). \tag{14} \]


Problem (8). State the corresponding theorem for integers. 
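When every modulus is linear, \(m_i(x) = x - a_i\), each \(a_i(x)\) in Theorem 5 is a constant and formula (14) reduces to Lagrange interpolation. The sketch below covers only this special case, over the prime field GF(5); the names `poly_mul` and `crt_linear` are ours, and polynomials are stored as coefficient lists with the constant term first.

```python
P = 5  # assumption: prime field GF(5)

def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists (low degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % P
    return out

def crt_linear(points, residues):
    """Solve u(x) = r_i mod (x - a_i) for deg u < K, following (14)."""
    K = len(points)
    u = [0] * K
    for i in range(K):
        basis = [1]   # M(x)/m_i(x) = prod_{j != i} (x - a_j)
        scale = 1     # the value of that product at x = a_i
        for j in range(K):
            if j != i:
                basis = poly_mul(basis, [(-points[j]) % P, 1])
                scale = scale * (points[i] - points[j]) % P
        a_i = pow(scale, P - 2, P)   # here a_i(x) is a constant
        for d, c in enumerate(basis):
            u[d] = (u[d] + residues[i] * a_i * c) % P
    return u
```

For the message u(x) = 3 + 4x the residues at x = 1 and x = 2 are 2 and 1, and `crt_linear([1, 2], [2, 1])` recovers the coefficients `[3, 4]`. No reduction mod M(x) is needed in this case, because each term already has degree below K.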


Theorem 5 shows that \(r_0, \ldots, r_{K-1}\) in (12) are enough to reconstruct u(x), in
the absence of noise. Thus \(r_K(x), \ldots, r_{N-1}(x)\) are redundant residues which are
included in the codeword for protection against errors. Any code of this type
is called a redundant residue code. We have shown that RS codes are
redundant residue codes, and in Ch. 12 we shall see that some Goppa codes
are also of this type. Other examples are mentioned in the Notes.


§10. Decoding RS codes


Since RS codes are special cases of BCH codes, they can be decoded by
the methods of §6 of Ch. 9. The original majority logic decoding method of
Reed and Solomon is also worth mentioning because of its considerable
theoretical interest, even though it is impractical.

Suppose the codeword (6) has been transmitted, an error vector
\(e = (e_0, \ldots, e_{N-1})\) occurs, and \(y = (y_0, \ldots, y_{N-1})\) is received. Thus the decoder
knows

\[
\begin{aligned}
y_0 &= e_0 + u_0 + u_1 + u_2 + \cdots + u_{K-1}, \\
y_1 &= e_1 + u_0 + \alpha u_1 + \alpha^2 u_2 + \cdots + \alpha^{K-1} u_{K-1}, \\
&\ \ \vdots \\
y_{N-1} &= e_{N-1} + u_0 + \alpha^{N-1} u_1 + \alpha^{2(N-1)} u_2 + \cdots + \alpha^{(K-1)(N-1)} u_{K-1}.
\end{aligned} \tag{15}
\]

If there are no errors, e = 0, and any K of these equations can be solved to
determine the message \(u = (u_0, \ldots, u_{K-1})\), since the coefficient matrix is
Vandermonde (Lemma 17 of Ch. 4). Thus there are \(\binom{N}{K}\) determinations, or votes,
for the correct u.

If there are errors, some sets of K equations will give the wrong u. But no
incorrect u can receive too many votes.


Theorem 6. If w errors occur, an incorrect u will receive at most \(\binom{w+K-1}{K}\) votes.
The correct u will receive at least \(\binom{N-w}{K}\) votes.


Proof. Since the equations in (15) are independent, any K of them have
exactly one solution u. To obtain more than one vote, u must be the solution
of more than K equations. An incorrect u can be the solution of at most
w + K − 1 equations, consisting of w erroneous equations and K − 1 correct
ones. (For if u is the solution of K correct equations then u is correct.)
Thus an incorrect u can be the solution to at most \(\binom{w+K-1}{K}\) sets of K equations.
Clearly there are \(\binom{N-w}{K}\) sets of correct equations giving the correct u. Q.E.D.


Thus the message u will be obtained correctly if \(\binom{N-w}{K} > \binom{w+K-1}{K}\), i.e. if
N − w > w + K − 1, or D = N − K + 1 > 2w. So error vectors of weight less
than half the minimum distance (and possibly others) can be corrected. Of
course if \(\binom{N}{K}\) is large this method is impractical.
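The voting procedure can be tried on the [4, 2, 3] RS code over GF(5) used earlier. This is a small sketch under stated assumptions (integers mod 5, primitive element 2, illustrative function names), not an efficient decoder: every set of K received symbols is interpolated and casts one vote.

```python
from collections import Counter
from itertools import combinations

P, ALPHA, N, K = 5, 2, 4, 2          # assumed toy parameters
POINTS = [pow(ALPHA, i, P) for i in range(N)]

def encode(u):
    return [sum(c * pow(x, j, P) for j, c in enumerate(u)) % P for x in POINTS]

def interpolate(xs, ys):
    """Coefficients of the unique polynomial of degree < len(xs) through (xs, ys)."""
    coeffs = [0] * len(xs)
    for i, xi in enumerate(xs):
        basis, denom = [1], 1
        for j, xj in enumerate(xs):
            if j != i:
                # multiply basis by (x - xj)
                basis = [(a - xj * b) % P for a, b in zip([0] + basis, basis + [0])]
                denom = denom * (xi - xj) % P
        w = ys[i] * pow(denom, P - 2, P) % P
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + w * c) % P
    return tuple(coeffs)

def majority_decode(y):
    """Return the most-voted message and the full vote count."""
    votes = Counter(
        interpolate([POINTS[i] for i in idx], [y[i] for i in idx])
        for idx in combinations(range(N), K)
    )
    return votes.most_common(1)[0][0], votes
```

With one error (w = 1) the correct message collects \(\binom{3}{2} = 3\) votes, while each wrong candidate collects at most \(\binom{2}{2} = 1\), in agreement with Theorem 6.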


§11. Justesen codes and concatenated codes


In §5 of Ch. 9 we saw that long BCH codes are bad, and in §5 of this
chapter that long binary codes obtained from RS codes are also bad.
However, by a very simple construction it is possible to obtain an infinite
family of good binary codes (called Justesen codes) from RS codes, as we
now show.

The starting point is an RS code \(\mathcal{R}\) over GF(\(2^m\)) with parameters
\([N = 2^m - 1,\ K,\ D = N - K + 1]\). Let \(\alpha\) be a primitive element of GF(\(2^m\)). Let
\(a = (a_0, \ldots, a_{N-1})\), \(a_i \in\) GF(\(2^m\)), be a typical codeword of \(\mathcal{R}\). Let b be the vector

\[ b = (a_0, a_0;\ a_1, \alpha a_1;\ a_2, \alpha^2 a_2;\ \ldots;\ a_{N-1}, \alpha^{N-1} a_{N-1}). \]

Finally, replacing each component of b by the corresponding binary m-tuple,
we obtain a binary vector c of length 2mN.


Definition. For any N and K, with 0 < K < N, the Justesen code \(\mathcal{J}_{N,K}\) consists
of all such vectors c which are obtained from the [N, K] Reed-Solomon code
\(\mathcal{R}\).


Clearly \(\mathcal{J}_{N,K}\) is a binary linear code of length n = 2mN and dimension
k = mK. The rate of this code is k/n = K/2N < 1/2. A larger class of Justesen
codes which includes codes of rate ≥ 1/2 will be described below.
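The construction can be carried out explicitly for a very small field. The following sketch makes several assumptions not in the text: m = 3, GF(8) is built from the primitive polynomial \(x^3 + x + 1\), field elements are stored as integers 0..7 whose bits are the binary m-tuple, and the RS codeword is obtained by evaluating the message polynomial at \(1, \alpha, \ldots, \alpha^{N-1}\).

```python
from itertools import product

M = 3                      # assumption: GF(2^3) built from x^3 + x + 1
N = 2**M - 1
EXP = [1]                  # EXP[i] = alpha^i as a 3-bit integer
for _ in range(N - 1):
    nxt = EXP[-1] << 1
    if nxt & 0b1000:
        nxt ^= 0b1011      # reduce modulo x^3 + x + 1
    EXP.append(nxt)
LOG = {e: i for i, e in enumerate(EXP)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % N]

def rs_codeword(u):
    """a_i = u(alpha^i) for i = 0..N-1; addition in GF(2^m) is XOR."""
    word = []
    for i in range(N):
        acc, power = 0, 1
        for coeff in u:
            acc ^= mul(coeff, power)
            power = mul(power, EXP[i])
        word.append(acc)
    return word

def justesen_codeword(u):
    """Binary image of (a_0, a_0; a_1, alpha a_1; ...; a_{N-1}, alpha^{N-1} a_{N-1})."""
    bits = []
    for i, a in enumerate(rs_codeword(u)):
        for sym in (a, mul(EXP[i], a)):
            bits.extend((sym >> t) & 1 for t in range(M))
    return bits
```

For K = 2 this gives 64 binary codewords of length 2mN = 42. Each nonzero \(a_i\) contributes a pair with both halves nonzero, so every nonzero codeword has weight at least 2(N − K + 1) = 12; the sharper estimates come later in this section.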

Justesen codes are an example of concatenated codes, which we now 
describe. 


Concatenated codes. Consider the arrangement shown in Fig. 10.7. Suppose
the inner encoder and decoder use a code (called the inner code) which is an
[n, k, d] binary code. The combination of inner encoder, channel, and inner
decoder can be thought of as forming a new channel (called a superchannel)
which transmits binary k-tuples. If these k-tuples are considered as elements
of GF(\(2^k\)), we can attempt to correct errors on the superchannel by taking the
outer code to be an [N, K, D] code over GF(\(2^k\)). Frequently a Reed-Solomon
code is used as the outer code. This combination of codes (or any similar
scheme) is called a concatenated code. The overall code (the supercode) is a
binary code of length nN, dimension kK, and rate (k/n) · (K/N).

The encoding is done as follows. The kK binary information symbols are
divided into K k-tuples, which are thought of as elements of GF(\(2^k\)). These
are then encoded by the outer encoder into the codeword \(a_0 a_1 \cdots a_{N-1}\). Each
\(a_i\) is now encoded by the inner encoder into a binary n-tuple \(b_i\). Then
\(b_0 b_1 \cdots b_{N-1}\) is the codeword of the supercode and is transmitted over the
channel.

[Figure: outer encoder, inner encoder, channel, inner decoder, outer decoder in
sequence; the inner encoder, channel and inner decoder form the superchannel.]

Fig. 10.7. A concatenated code.

Problem. (9) (a) Show that the minimum distance of the concatenated code is
at least dD.

(b) For example, use an [8, 4, 4] binary inner code and a [12, 6, 7] outer RS
code over GF(\(2^4\)) to obtain a [96, 24, 28] binary concatenated code. Obtain a
[104, 24, 32] code in the same way.

Justesen codes may be thought of as concatenated codes where the inner
encoder uses N distinct codes. Let \(\mathcal{C}_i\) be the [2m, m] binary code consisting
of the binary representations of the vectors \((u, \alpha^i u)\), \(u \in\) GF(\(2^m\)). Then the i-th
symbol \(a_i\) of the outer code is encoded by \(\mathcal{C}_i\).


Minimum distance of Justesen codes. The key point is that every binary
vector (u, v) of length 2m with u ≠ 0, v ≠ 0 belongs to exactly one of the codes
\(\mathcal{C}_i\). Therefore a typical codeword c of \(\mathcal{J}_{N,K}\) contains at least D distinct
nonzero binary (2m)-tuples.
Since there aren't many (2m)-tuples of low weight, the total weight of c
must be large. Thus if l is as large as possible subject to

\[ \sum_{i=1}^{l} \binom{2m}{i} \le D, \]

then if D is large so is l, and

\[ \mathrm{wt}(c) \ge \sum_{i=1}^{l} i \binom{2m}{i} \]

is also large.
Estimates of binomial coefficients. Before proving the first theorem we need
some estimates of binomial coefficients. These involve the entropy function
\(H_2(x)\), defined by

\[ H_2(x) = -x \log_2 x - (1 - x) \log_2 (1 - x) \]

where 0 ≤ x ≤ 1 (Fig. 10.8). Fano [415] gives a useful table of \(H_2(x)\). \(H_2(x)\) is
used in information theory as a measure of "uncertainty", but to us it is just a
convenient function to have because of its role in Lemmas 7 and 8.

[Figure: plot of \(H_2(x)\) against x for 0 ≤ x ≤ 1.]

Fig. 10.8. The entropy function \(H_2(x)\).

We shall also need the inverse function \(H_2^{-1}(y)\) (see Fig. 10.9) defined by

\[ x = H_2^{-1}(y) \quad \text{iff} \quad y = H_2(x) \]

for 0 ≤ x ≤ 1/2.

[Figure: plot of \(H_2^{-1}(y)\) against y for 0 ≤ y ≤ 1.]

Fig. 10.9. The inverse function \(H_2^{-1}(y)\).


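Lemma 7. (An estimate for a binomial coefficient.) Suppose λn is an integer,
where 0 < λ < 1. Then

\[ \frac{1}{\sqrt{8n\lambda(1-\lambda)}}\, 2^{nH_2(\lambda)} \le \binom{n}{\lambda n} \le \frac{1}{\sqrt{2\pi n\lambda(1-\lambda)}}\, 2^{nH_2(\lambda)}. \tag{16} \]


Proof. The proof uses Stirling's formula for n!:

\[ \sqrt{2\pi}\, n^{n+1/2} e^{-n + 1/(12n) - 1/(360n^3)} < n! < \sqrt{2\pi}\, n^{n+1/2} e^{-n + 1/(12n)}. \tag{17} \]

Therefore, with μ = 1 − λ,

\[
\begin{aligned}
\binom{n}{\lambda n} &= \frac{n!}{(\lambda n)!\,(\mu n)!} \\
&\ge \frac{1}{\sqrt{2\pi n\lambda\mu}} \cdot \frac{1}{\lambda^{\lambda n} \mu^{\mu n}}\, e^{-1/(12n\lambda) - 1/(12n\mu)} \\
&= \frac{1}{\sqrt{2\pi n\lambda\mu}}\, 2^{nH_2(\lambda)}\, e^{-1/(12n\lambda\mu)}.
\end{aligned}
\]

The LHS of (16) now follows after a bit of algebra. The proof of the RHS is
similar and is left to the reader (Problem 10).

Remark. For most purposes simpler versions of Stirling's formula,

\[ \sqrt{2\pi}\, n^{n+1/2} e^{-n} < n! < \sqrt{2\pi}\, n^{n+1/2} e^{-n + 1/(12n)} \tag{18} \]

or

\[ n! \sim \sqrt{2\pi}\, n^{n+1/2} e^{-n} \quad \text{as } n \to \infty \]

are adequate.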
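The two-sided estimate (16) is easy to confirm numerically. The following is a small sketch using only the Python standard library; the function names are ours.

```python
from math import comb, log2, pi, sqrt

def H2(x):
    """Binary entropy function, with H2(0) = H2(1) = 0."""
    return 0.0 if x in (0, 1) else -x * log2(x) - (1 - x) * log2(1 - x)

def lemma7_bounds(n, k):
    """Return (lower bound, C(n, k), upper bound) from (16) with lambda = k/n."""
    lam = k / n
    scale = 2 ** (n * H2(lam))
    lower = scale / sqrt(8 * n * lam * (1 - lam))
    upper = scale / sqrt(2 * pi * n * lam * (1 - lam))
    return lower, comb(n, k), upper
```

For example, with n = 10 and k = 5 the three values are roughly 229, 252 and 258, which shows how tight the bounds already are for small n.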


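Lemma 8. (Estimate for a sum of binomial coefficients.) Suppose λn is an
integer, where 1/2 < λ < 1. Then

\[ \frac{1}{\sqrt{8n\lambda(1-\lambda)}}\, 2^{nH_2(\lambda)} \le \sum_{k=\lambda n}^{n} \binom{n}{k} \le 2^{nH_2(\lambda)}. \tag{19} \]


Proof. We first prove the RH inequality. For any positive number r we have

\[ 2^{r\lambda n} \sum_{k=\lambda n}^{n} \binom{n}{k} \le \sum_{k=\lambda n}^{n} 2^{rk} \binom{n}{k} \le \sum_{k=0}^{n} 2^{rk} \binom{n}{k} = (1 + 2^r)^n. \]

Thus

\[ \sum_{k=\lambda n}^{n} \binom{n}{k} \le \bigl\{ 2^{-r\lambda} + 2^{r(1-\lambda)} \bigr\}^n. \]

Choose \(r = \log_2 (\lambda/(1 - \lambda))\). Then this sum is

\[ \le 2^{nH_2(\lambda)} (1 - \lambda + \lambda)^n = 2^{nH_2(\lambda)}. \]

The LH inequality follows from

\[ \sum_{k=\lambda n}^{n} \binom{n}{k} \ge \binom{n}{\lambda n} \]

and the previous lemma. Q.E.D.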


Corollary 9. For 0 < μ < 1/2,

\[ \frac{1}{\sqrt{8n\mu(1-\mu)}}\, 2^{nH_2(\mu)} \le \sum_{k=0}^{\mu n} \binom{n}{k} \le 2^{nH_2(\mu)}. \tag{20} \]
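Both Lemma 8 and Corollary 9 can be spot-checked in the same way as Lemma 7. The sketch below assumes that k = λn (respectively k = μn) is passed in as an integer; the function names are illustrative.

```python
from math import comb, log2, sqrt

def H2(x):
    return 0.0 if x in (0, 1) else -x * log2(x) - (1 - x) * log2(1 - x)

def check_lemma8(n, k):
    """Check (19) for lambda = k/n, where 1/2 < lambda < 1."""
    lam = k / n
    tail = sum(comb(n, j) for j in range(k, n + 1))
    upper = 2 ** (n * H2(lam))
    lower = upper / sqrt(8 * n * lam * (1 - lam))
    return lower <= tail <= upper

def check_corollary9(n, k):
    """Check (20) for mu = k/n, where 0 < mu < 1/2."""
    mu = k / n
    head = sum(comb(n, j) for j in range(0, k + 1))
    upper = 2 ** (n * H2(mu))
    lower = upper / sqrt(8 * n * mu * (1 - mu))
    return lower <= head <= upper
```

The corollary is the mirror image of the lemma: replacing k by n − k turns the tail sum into the head sum and leaves \(H_2\) unchanged.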
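Analysis of Justesen codes.


Lemma 10. If we are given M distinct nonzero L-tuples, where

\[ M = \gamma (2^{\delta L} - 1), \qquad 0 < \gamma < 1, \quad 0 < \delta < 1, \]

then the sum of the weights of these L-tuples is at least

\[ L\gamma (2^{\delta L} - 1)\bigl(H_2^{-1}(\delta) - o(1)\bigr). \]

[Recall that f(L) = o(g(L)) means f(L)/g(L) → 0 as L → ∞.]


Proof. The number of these L-tuples having weight ≤ λL is at most

\[ \sum_{i=1}^{\lambda L} \binom{L}{i} \le 2^{LH_2(\lambda)} \]

by Corollary 9, for any 0 < λ < 1/2.
So the total weight is at least

\[ \lambda L \bigl(M - 2^{LH_2(\lambda)}\bigr) = \lambda L M \Bigl(1 - \frac{2^{LH_2(\lambda)}}{M}\Bigr). \]

Choose \(\lambda = H_2^{-1}(\delta - 1/\log L) = H_2^{-1}(\delta) - o(1)\), with λ < 1/2. Then the total weight
is at least

\[
L\gamma (2^{\delta L} - 1)(1 - o(1))\bigl(H_2^{-1}(\delta) - o(1)\bigr)
= L\gamma (2^{\delta L} - 1)\bigl(H_2^{-1}(\delta) - o(1)\bigr). \qquad \text{Q.E.D.}
\]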


Theorem 11. (Justesen.) Let R be fixed, 0 < R < 1/2. For each m choose

\[ K = \lceil R \cdot 2N \rceil = \lceil R \cdot 2(2^m - 1) \rceil. \tag{21} \]

Then the Justesen code \(\mathcal{J}_{N,K}\) is a binary code of length n = 2mN with rate

\[ R_m = \frac{K}{2N} \ge R \]

and the minimum distance \(d_m\) satisfying

\[ \frac{d_m}{n} \ge (1 - 2R)\bigl(H_2^{-1}(0.5) - o(1)\bigr) \approx 0.110\,(1 - 2R). \]

The lower bound on \(d_m/n\) is linear in R and is plotted in Fig. 10.10.
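The constant 0.110 in Theorem 11 is \(H_2^{-1}(0.5)\), which has no closed form but is easily found by bisection, since \(H_2\) is increasing on [0, 1/2]. A minimal sketch (function names are ours):

```python
from math import log2

def H2(x):
    return 0.0 if x <= 0.0 or x >= 1.0 else -x * log2(x) - (1 - x) * log2(1 - x)

def H2_inv(y, tol=1e-12):
    """Inverse of H2 on [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if H2(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def theorem11_bound(R):
    """Asymptotic lower bound (1 - 2R) * H2_inv(0.5) on distance/length."""
    return (1 - 2 * R) * H2_inv(0.5)
```

For instance, at rate R = 1/4 the bound on distance/length is about 0.055.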


Proof. Let c be any nonzero codeword. As we saw earlier, c contains at least
N − K + 1 distinct nonzero binary (2m)-tuples. From (21),

\[ N - K + 1 = N\Bigl(1 - \frac{K-1}{N}\Bigr) \ge N(1 - 2R). \]

We can now apply Lemma 10 with L = 2m, δ = 1/2, γ = 1 − 2R, and deduce that

\[ \mathrm{wt}(c) \ge 2m(1 - 2R)(2^m - 1)\bigl(H_2^{-1}(0.5) - o(1)\bigr), \]

so that

\[ \frac{d_m}{2mN} \ge (1 - 2R)\bigl(H_2^{-1}(0.5) - o(1)\bigr). \qquad \text{Q.E.D.} \]

[Figure: lower bounds on distance/length plotted against the rate R, 0 ≤ R ≤ 1,
showing the Gilbert-Varshamov bound and the bound of Theorem 12.]

Fig. 10.10. The Justesen codes compared with the Gilbert-Varshamov bound.


Theorem 11 only gives codes of rate less than 1/2. A larger class of codes can be
obtained by puncturing \(\mathcal{J}_{N,K}\) in the following way. Each component of the
vector

\[ b = (a_0, a_0;\ a_1, \alpha a_1;\ a_2, \alpha^2 a_2;\ \ldots;\ a_{N-1}, \alpha^{N-1} a_{N-1}) \]

is expressed as a binary m-tuple as before, and then the last s binary digits
are punctured from the alternate components

\[ a_0,\ \alpha a_1,\ \alpha^2 a_2,\ \ldots,\ \alpha^{N-1} a_{N-1}. \]

The set of all such punctured vectors forms the Justesen code \(\mathcal{J}^{s}_{N,K}\). This is a
binary linear code with parameters

\[ n = (2m - s)N, \qquad k = mK. \]

We shall choose s later. This is a concatenated code for which the inner codes
have rate

\[ r_m = \frac{m}{2m - s}, \]

the outer RS code has rate \(R_{RS} = K/N\), and the overall rate of \(\mathcal{J}^{s}_{N,K}\) is the
product

\[ R_m = r_m R_{RS} = \frac{mK}{(2m - s)N}. \]

As before the vector \((a_0, \ldots, a_{N-1})\) has

\[ \text{weight} \ge N\Bigl(1 - \frac{K-1}{N}\Bigr). \]

However, after puncturing each nonzero pair may occur as many as \(2^s\) times.
The lowest weight would occur when each nonzero (2m − s)-tuple appears
exactly \(2^s\) times. The number of distinct nonzero (2m − s)-tuples is then at
least

\[ \frac{N - K + 1}{2^s} = \frac{N}{2^s}\Bigl(1 - \frac{K-1}{N}\Bigr). \]

Given R we choose

\[ K = \Bigl\lceil \frac{RN(2m - s)}{m} \Bigr\rceil \tag{22} \]

so that

\[ \frac{K-1}{N} < \frac{R(2m - s)}{m}. \]

Therefore the number of distinct nonzero (2m − s)-tuples is

\[ \ge \frac{2^m - 1}{2^s}\Bigl(1 - \frac{R(2m - s)}{m}\Bigr). \]

From Lemma 10 with L = 2m − s, δ = (m − s)/L, γ = 1 − R(2m − s)/m we
deduce that the weight of the codeword is at least

\[ 2^s (2m - s)\Bigl(1 - \frac{R(2m - s)}{m}\Bigr)(2^{m-s} - 1)\bigl(H_2^{-1}(\delta) - o(1)\bigr), \]

where the initial \(2^s\) is because each (2m − s)-tuple occurs \(2^s\) times (in the
worst case). So for this code

\[ \frac{\text{distance}}{\text{length}} \ge \frac{2^m - 2^s}{2^m - 1}\Bigl(1 - \frac{R(2m - s)}{m}\Bigr)\Bigl(H_2^{-1}\Bigl(\frac{m - s}{2m - s}\Bigr) - o(1)\Bigr). \]

Choose an r such that 1/2 ≤ r < 1, and set

\[ s = \Bigl\lceil \frac{m(2r - 1)}{r} \Bigr\rceil, \tag{23} \]

so that

\[ r_m = \frac{m}{2m - s} = r + o(1). \]

Then as m → ∞,

\[ \frac{2m - s}{m} \to \frac{1}{r}, \qquad \frac{2^m - 2^s}{2^m - 1} \to 1, \]

hence also (m − s)/(2m − s) → 1 − r, and the lower bound on distance/length approaches

\[ \Bigl(1 - \frac{R}{r}\Bigr) H_2^{-1}(1 - r) \quad \text{as } m \to \infty. \]


Finally we shall choose r to maximize this expression. Setting the derivative
with respect to r equal to 0 we find that r should be chosen as the solution \(r_0\)
(say) of

\[ R = \frac{r^2}{1 + \log_2 \bigl[1 - H_2^{-1}(1 - r)\bigr]} \tag{24} \]

provided \(r_0 \ge 1/2\). If \(r_0 < 1/2\), which happens when R < 0.30, take r = 1/2. Thus we
have proved





Theorem 12. (Justesen.) Let R be fixed, 0 < R < 1. The punctured code \(\mathcal{J}^{s}_{N,K}\),
with r equal to the maximum of 1/2 and the solution of (24), and s and K given
by (23) and (22), has a lower bound to distance/length equal to

\[ \Bigl(1 - \frac{R}{r}\Bigr) H_2^{-1}(1 - r) \quad \text{as } m \to \infty. \]


For comparison, the asymptotic form of the Gilbert-Varshamov bound
(Theorem 12 of Ch. 1, Theorem 30 of Ch. 17) is that codes exist for which
distance/length has a lower bound which approaches \(H_2^{-1}(1 - R)\) as length → ∞.
(See Fig. 10.10.) Consider a point \((r, H_2^{-1}(1 - r))\) on this curve. The straight line
joining the projections of this point on the axes, i.e. (r, 0) and \((0, H_2^{-1}(1 - r))\),
has equation \(y = (1 - R/r) H_2^{-1}(1 - r)\). The Justesen bound of Theorem 12 is
the maximum of \((1 - R/r) H_2^{-1}(1 - r)\) for 1/2 ≤ r ≤ 1, and so is the envelope of
these lines. This bound is also shown in Fig. 10.10.

The bounds of Theorems 11 and 12 meet when R = 0.30.
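The bound of Theorem 12 and the crossover rate can be evaluated numerically. The sketch below is only an illustration: it replaces the exact solution of (24) by a plain grid search over 1/2 ≤ r ≤ 1, and obtains the crossover by putting r = 1/2 in (24). All function names are ours.

```python
from math import log2

def H2(x):
    return 0.0 if x <= 0.0 or x >= 1.0 else -x * log2(x) - (1 - x) * log2(1 - x)

def H2_inv(y, tol=1e-12):
    """Inverse of H2 on [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if H2(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def theorem12_bound(R, steps=2000):
    """Grid-search maximum of (1 - R/r) * H2_inv(1 - r) over 1/2 <= r <= 1."""
    best = 0.0
    for i in range(steps + 1):
        r = 0.5 + 0.5 * i / steps
        best = max(best, (1 - R / r) * H2_inv(1 - r))
    return best

def crossover_rate():
    """The rate R given by (24) at r = 1/2."""
    x = H2_inv(0.5)
    return 0.25 / (1 + log2(1 - x))
```

Below the crossover (about 0.30) the maximum is attained at r = 1/2 and the value coincides with the bound of Theorem 11; above it the grid search picks an interior r.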

Finally, the following simple argument shows that concatenated codes exist
which lie on or above the envelope of these lines for all 0 ≤ r ≤ 1. Consider
codes obtained by concatenating an inner [n, k, d] binary code and an
[N, K, D] outer code over GF(\(2^k\)). To avoid trivial cases we insist that both n
and N approach infinity. The overall rate is R = kK/nN.


Theorem 13. (Zyablov.) There exist concatenated codes with n → ∞, N → ∞, in
which the outer code is maximal distance separable, and which satisfy

\[ \frac{\text{distance}}{\text{length}} \ge \max_{0 \le r \le 1} \Bigl\{ \Bigl(1 - \frac{R}{r}\Bigr) H_2^{-1}(1 - r) \Bigr\}. \]


Proof. Since the outer code is MDS, D = N − K + 1, so

\[ \frac{D}{N} = 1 - \frac{K}{N} + \frac{1}{N}. \]

Take the inner code to meet the Garshamov bound, so \(d/n \ge H_2^{-1}(1 - r)\), where
r = k/n. Then for the overall code

\[ \frac{\text{distance}}{\text{length}} \ge \frac{dD}{nN} \ge \Bigl(1 - \frac{R}{r}\Bigr) H_2^{-1}(1 - r) \]

as n → ∞, N → ∞. Q.E.D.


This bound is indicated by the broken line in Fig. 10.10. For R ≥ 0.30
Justesen codes meet the bound.


Research Problem (10.3). Give an explicit construction for concatenated codes
which meet the bound when R < 0.30.


Remark. Blokh and Zyablov have shown [165] that the class of all
concatenated codes (with n → ∞ and N → ∞) contains codes meeting the
Garshamov bound. As usual this proof is not constructive.


Problem. (10) Prove the RHS inequality of (16). 


Notes on Chapter 10 


§2. The first time RS codes appear as codes is in Reed and Solomon [1106].
However, they had already been explicitly constructed by Bush [220] in 1952,
using the language of orthogonal arrays. RS codes are extensively discussed
by Forney [436]. Gore [547] has shown that in many situations the binary
versions of RS codes have a lower error probability than binary BCH codes
with the same parameters.

Solomon [1254] uses codewords of maximum weight in an RS code to
encode messages over alphabets which "are not quite the right size,
field-wise", i.e. are not of prime power order, such as the English alphabet.
Solomon [1250] and Reed et al. [1105, 1107] describe synchronizable codes
constructed from RS codes. See also Ebert and Tong [402] and Wolverton
[1432].


§3. For extending RS codes see Bush [220], Gross [563], Tanaka and Nishida
[1303] and Wolf [1427].


§5. Example 3 was found by Solomon [1252]; see also MacWilliams [879].


§6. Burst error correcting codes. Peterson and Weldon [1040] have an
excellent treatment. Some of the most important papers are: Bahl and Chien
[57], Bridewell and Wolf [197], Burton [215], Burton et al. [218], Chien et al.
[280, 285, 292], Elliott [407], Forney [440], Fujiwara et al. [462], Hagelbarger
[573], Iwadare [684, 685], Kasami [723], Kasami and Matora [743],
Mandelbaum [897], Pehlert [1033], Posner [1069], Shiva and Sheng [1204], Tanaka
and Nishida [1305], Tauglikh [1310], Tavares and Shiva [1314], Tong
[1334-1336] and Wainberg and Wolf [1382, 1384].


§7. Other encoding methods for RS codes are described by Stone [1282].

§8. Delsarte [359] has studied generalized RS codes.


§9. For the Chinese remainder theorem see Niven and Zuckerman [995, p. 33]
or Uspensky and Heaslet [1359, pp. 189-191]. Bossen [188], Bossen and Yau
[189], Mandelbaum [896, 898, 899], and Stone [1282] have described other
redundant residue codes, which are also MDS codes over GF(\(q^m\)). However,
Reed-Solomon codes seem to be the most important codes in this class.


§10. Burton [215], Gore [543, 547], Mandelbaum [896, 898, 899] and Yau and
Liu [1446] have studied the decoding of RS codes.


§11. Justesen's original paper is [704]; see also [705]. Sugiyama et al. [1285,
1287] and Weldon [1405] (see also [1406]) have given small improvements on
Justesen's construction at very low rates. Concatenated codes were
introduced by Forney [436]. See also Blokh and Zyablov [165, 166], Savage
[1151], Zyablov [1475, 1476] and §§5 and 8.2 of Ch. 18. Problem 9b is due to
Sugiyama and Kasahara [1286].

Lemma 7 is a special case of the Chernoff bound; see Ash [32], or Jelinek
[690, p. 117]. Park [1021] gives an inductive proof of the RHS of (16). Massey
[923] has given an alternative version of Lemma 10. Theorem 13 is due to
Zyablov [1475] (see also [165, 166 and 1476]).


MDS codes 


§1. Introduction 


We come now to one of the most fascinating chapters in all of coding
theory: MDS codes. In Theorem 11 of Ch. 1 it was shown that for a linear
code over any field, d ≤ n − k + 1. Codes with d = n − k + 1 are called
maximum distance separable, or MDS for short. The name comes from the
fact that such a code has the maximum possible distance between codewords,
and that the codewords may be separated into message symbols and check
symbols (i.e. the code has a systematic encoder, using the terminology of §7
of Ch. 10). In fact any k symbols may be taken as message symbols, as
Corollary 3 shows. MDS codes are also called optimal, but we prefer the less
ambiguous term.

In this chapter various properties of MDS codes will be derived. We shall
also see that the problem of finding the longest possible MDS code with a
given dimension is equivalent to a surprising list of combinatorial problems,
none of which is completely solved (see Research Problems 11.1a to 11.1f). In
Fig. 11.2 and Research Problem 11.4 we state what is conjectured to be the
solution to some of these problems.

In §2 of the preceding chapter it was shown that there is an
[n = q − 1, k, d = n − k + 1] Reed-Solomon (or RS) code over GF(q), for all k = 1, ...,
n, and that these codes are MDS codes. Furthermore in §3 an overall parity
check was added producing [n + 1, k, n − k + 2] extended RS codes, also
MDS. It is natural to ask if more parity checks can be added, while preserving
the property of being MDS. The answer seems to be (§§5, 7 below) that one or
two further parity checks can be added, but probably no more. More
generally, we state the first version of our problem.

Research Problem (11.1a). Given k and q, find the largest value of n for which
an [n, k, n − k + 1] MDS code exists over GF(q). Let m(k, q) denote this
largest value of n.


It will turn out that in all the known cases, when an [n, k, d] MDS code
exists, then an [n, k, d] RS or extended RS code with the same parameters
also exists. Thus as far as is known at present, RS and extended RS codes are
the most important class of MDS codes. For this reason we don't give a
separate discussion of decoding MDS codes but refer to §10 of Ch. 10.


Problem. (1) Show that [n, 1, n], [n, n − 1, 2] and [n, n, 1] MDS codes exist
over any field. These are called trivial MDS codes. For a nontrivial code,
2 ≤ k ≤ n − 2.


§2. Generator and parity check matrices


Let \(\mathcal{C}\) be an [n, k, d] code over GF(q) with parity check matrix H and
generator matrix G.


Theorem 1. \(\mathcal{C}\) is MDS iff every n − k columns of H are linearly independent.


Proof. \(\mathcal{C}\) contains a codeword of weight w iff w columns of H are linearly
dependent (Theorem 10 of Ch. 1). Therefore \(\mathcal{C}\) has d = n − k + 1 iff no n − k
or fewer columns of H are linearly dependent. Q.E.D.


Theorem 2. If \(\mathcal{C}\) is MDS so is the dual code \(\mathcal{C}^{\perp}\).


Proof. H is a generator matrix for \(\mathcal{C}^{\perp}\). From Theorem 1, any n − k columns of
H are linearly independent, so only the zero codeword is zero on as many as
n − k coordinates. Therefore \(\mathcal{C}^{\perp}\) has minimum distance at least k + 1, i.e. it
has parameters [n, n − k, k + 1]. Q.E.D.


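Example.

\[ \begin{pmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & \alpha & \beta \end{pmatrix} \]

is the generator matrix for a [4, 2, 3] MDS code \(\mathcal{C}\) over GF(4) =
\(\{0, 1, \alpha, \beta = \alpha^2\}\). The dual code \(\mathcal{C}^{\perp}\) has generator matrix

\[ \begin{pmatrix} 1 & \alpha & 1 & 0 \\ 1 & \beta & 0 & 1 \end{pmatrix} \]

and is also a [4, 2, 3] MDS code.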


Corollary 3. Let \(\mathcal{C}\) be an [n, k, d] code over GF(q). The following statements
are equivalent:

(i) \(\mathcal{C}\) is MDS;

(ii) every k columns of a generator matrix G are linearly independent (i.e.
any k symbols of the codewords may be taken as message symbols);

(iii) every n − k columns of a parity check matrix H are linearly
independent.


Proof. From Theorems 1 and 2. Q.E.D.
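Condition (ii) gives a direct, if exhaustive, test for the MDS property. The sketch below assumes a prime field GF(p), represented by the integers mod p, and uses Gaussian elimination for the determinants; the function names are ours.

```python
from itertools import combinations

def det_mod_p(rows, p):
    """Determinant of a square integer matrix modulo the prime p."""
    a = [[x % p for x in r] for r in rows]
    n, det = len(a), 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c]), None)
        if pivot is None:
            return 0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det = det * a[c][c] % p
        inv = pow(a[c][c], p - 2, p)
        for r in range(c + 1, n):
            f = a[r][c] * inv % p
            a[r] = [(x - f * y) % p for x, y in zip(a[r], a[c])]
    return det % p

def is_mds(G, p):
    """True iff every k columns of the k x n generator matrix G are independent."""
    k, n = len(G), len(G[0])
    return all(det_mod_p([[row[c] for c in cols] for row in G], p) != 0
               for cols in combinations(range(n), k))
```

For example, over GF(5) the matrix with rows (1, 1, 1, 1, 1, 0) and (1, 2, 4, 3, 0, 1) passes the test, giving a [6, 2, 5] MDS code, whereas any generator matrix with two proportional columns fails it.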
The open problem can now be restated as


Research Problem (11.1b). Given k and q, find the largest n for which there is a
k × n matrix over GF(q) having every k columns linearly independent.


Equivalently, in vector space terminology:


Research Problem (11.1c). Given a k-dimensional vector space over GF(q),
what is the largest number of vectors with the property that any k of them
form a basis for the space?


Problems. (2) Show that the only binary MDS codes are the trivial ones.
(3) The Singleton bound for nonlinear codes. If \(\mathcal{C}\) is an (n, M, d) code over
GF(q), show that \(d \le n - \log_q M + 1\).


§3. The weight distribution of an MDS code


Surprisingly, the Hamming weight distribution of an MDS code is
completely determined.


Theorem 4. Let \(\mathcal{C}\) be an [n, k, d] code over GF(q). Then \(\mathcal{C}\) is MDS iff \(\mathcal{C}\) has a
minimum weight codeword in any d coordinates.

Proof. (Only if.) Given any n − k + 1 coordinates, take one of them together
with the complementary k − 1 coordinates as message symbols (which can be
done by Corollary 3). Setting the single coordinate equal to 1 and the k − 1 to
0 gives a codeword of weight n − k + 1. The proof of the converse is left to
the reader. Q.E.D.


Corollary 5. The number of codewords in \(\mathcal{C}\) of weight n − k + 1 is

\[ (q - 1)\binom{n}{n-k+1}. \]


An MDS code has k distinct nonzero weights, n − k + 1, ..., n, and the
dual code has minimum distance d' = k + 1. Therefore by Theorem 29 of Ch.
6, the codewords of weight d form a t-design, which however by Theorem 4 is
just a trivial design. Theorem 4 of Ch. 6 also determines the weight
distribution, but in this case it is easier to begin from the MacWilliams identities in
the form of Problem (6) of Ch. 5, namely

\[ \sum_{i=0}^{n-j} \binom{n-i}{j} A_i = q^{k-j} \sum_{i=0}^{j} \binom{n-i}{j-i} A'_i, \qquad j = 0, 1, \ldots, n. \]

Since \(A_i = 0\) for 1 ≤ i ≤ n − k and \(A'_i = 0\) for 1 ≤ i ≤ k, this becomes

\[ \sum_{i=n-k+1}^{n-j} \binom{n-i}{j} A_i = \binom{n}{j}(q^{k-j} - 1), \qquad j = 0, 1, \ldots, k-1. \]

Setting j = k − 1 and k − 2 gives

\[ A_{n-k+1} = \binom{n}{k-1}(q - 1), \]

\[ \binom{k-1}{k-2} A_{n-k+1} + A_{n-k+2} = \binom{n}{k-2}(q^2 - 1), \]

\[ A_{n-k+2} = \binom{n}{k-2}\bigl[(q^2 - 1) - (n - k + 2)(q - 1)\bigr]. \]

It is not hard to guess (and to verify) that the general solution is

\[ A_{n-k+r} = \binom{n}{k-r} \sum_{j=0}^{r-1} (-1)^j \binom{n-k+r}{j}(q^{r-j} - 1). \]

Hence we have


Theorem 6. The number of codewords of weight w in an [n, k, d = n − k + 1]
MDS code over GF(q) is

\[
\begin{aligned}
A_w &= \binom{n}{w} \sum_{j=0}^{w-d} (-1)^j \binom{w}{j}(q^{w-d+1-j} - 1) \\
&= \binom{n}{w}(q - 1) \sum_{j=0}^{w-d} (-1)^j \binom{w-1}{j} q^{w-d-j}.
\end{aligned}
\]

We note that

\[ A_{n-k+2} = \binom{n}{k-2}(q - 1)(q - n + k - 1). \]


This number must be nonnegative, hence


Corollary 7. Let \(\mathcal{C}\) be an [n, k, n − k + 1] MDS code. If k ≥ 2, q ≥ n − k + 1. If
k ≤ n − 2, q ≥ k + 1.


Proof. The second statement follows from examining the weight distribution
of \(\mathcal{C}^{\perp}\). Q.E.D.
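The second form of the formula in Theorem 6 translates directly into a short routine. The following sketch (with an illustrative function name) returns the whole weight distribution as a list indexed by weight.

```python
from math import comb

def mds_weight_distribution(n, k, q):
    """A[w] = number of codewords of weight w in an [n, k, n-k+1] MDS code over GF(q)."""
    d = n - k + 1
    A = [0] * (n + 1)
    A[0] = 1
    for w in range(d, n + 1):
        A[w] = comb(n, w) * (q - 1) * sum(
            (-1) ** j * comb(w - 1, j) * q ** (w - d - j)
            for j in range(w - d + 1))
    return A
```

For the [4, 2, 3] RS code over GF(5) this gives 16 codewords of weight 3 and 8 of weight 4, which together with the zero word account for all \(5^2 = 25\) codewords.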


Problems. (4) Prove the converse part of Theorem 4.

(5) A real code consists of all linear combinations with real coefficients of
the rows of a generator matrix \(G = (g_{ij})\), where the \(g_{ij}\) are real numbers.
Justify the statement that most real codes are MDS.


Research Problem (11.2). What can be said about the complete weight
enumerator (see §6 of Ch. 5) of an MDS code, or even of an RS code?


§4. Matrices with every square submatrix nonsingular


Theorem 8. An [n, k, d] code \(\mathcal{C}\) with generator matrix G = [I | A], where A is a
k × (n − k) matrix, is MDS iff every square submatrix (formed from any i rows
and any i columns, for any i = 1, 2, ..., min{k, n − k}) of A is nonsingular.


Proof. (⇒) Suppose \(\mathcal{C}\) is MDS. By Corollary 3, every k columns of G are
linearly independent. The idea of the proof is very simple and we shall just
illustrate it by proving that the top right 3 × 3 submatrix A' of A is nonsingular.
Take the matrix B consisting of the last k − 3 columns of I and the first 3
columns of A:

\[ B = \begin{pmatrix} 0 & A' \\ I_{k-3} & * \end{pmatrix}. \]

Then det B = ± det A' ≠ 0. The general case is handled in the same way. (⇐)
The converse is immediate. Q.E.D.


Examples. (1) The [4, 2, 3] code over GF(4) shown in Fig. 10.2 has 


-[8 a] 

E [4 gl 

and indeed every square submatrix of A (of size 1 and 2) is nonsingular. 
(2) The [5,2, 4] extended RS code over GF(5) has 


342 
a-f; al 


From Theorem 8 the next version of our problem is: 


Research Problem (11.1d). Given k and q, find the largest r such that there
exists a k × r matrix having entries from GF(q) with the property that every
square submatrix is nonsingular.


Problem. (6) (Singleton.) Show that any rectangular submatrix A of the arrays
in Fig. 11.1 has the property that any k × k submatrix of A is nonsingular over
GF(q).


LJ = M LN — 1 
AN RAW = 
ANA — 
An fk = 
AN — 

a= 


Fig. 11.1. 


Research Problem (11.3). Generalize Fig. 11.1 for larger q.

Problem. (7) (a) Show that any square submatrix of a Vandermonde matrix
with real, positive entries is nonsingular. Show that this is not true for
Vandermonde matrices over finite fields.

(b) Given \(x_1, \ldots, x_n, y_1, \ldots, y_n\), the matrix \(C = (c_{ij})\) where \(c_{ij} = 1/(x_i + y_j)\) is
called a Cauchy matrix. Show that

\[ \det(C) = \frac{\prod_{1 \le i < j \le n} (x_j - x_i)(y_j - y_i)}{\prod_{1 \le i, j \le n} (x_i + y_j)}. \]

Hence, provided the \(x_i\) are distinct, the \(y_j\) are distinct, and \(x_i + y_j \neq 0\) for all i, j,
it follows that any square submatrix of a Cauchy matrix is nonsingular over
any field.
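The Cauchy matrix property can be checked exhaustively on a small example, which by Theorem 8 also yields an MDS code with generator matrix [I | C]. This sketch assumes a prime field GF(p) (integers mod p); the particular x and y values in the usage note are our own choice.

```python
from itertools import combinations

def det_mod_p(rows, p):
    """Determinant of a square integer matrix modulo the prime p."""
    a = [[x % p for x in r] for r in rows]
    n, det = len(a), 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c]), None)
        if pivot is None:
            return 0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det = det * a[c][c] % p
        inv = pow(a[c][c], p - 2, p)
        for r in range(c + 1, n):
            f = a[r][c] * inv % p
            a[r] = [(x - f * y) % p for x, y in zip(a[r], a[c])]
    return det % p

def cauchy_matrix(xs, ys, p):
    """Entries 1/(x_i + y_j) mod p; requires x_i + y_j != 0 mod p."""
    return [[pow((x + y) % p, p - 2, p) for y in ys] for x in xs]

def all_square_submatrices_nonsingular(A, p):
    rows, cols = len(A), len(A[0])
    for size in range(1, min(rows, cols) + 1):
        for rsel in combinations(range(rows), size):
            for csel in combinations(range(cols), size):
                sub = [[A[r][c] for c in csel] for r in rsel]
                if det_mod_p(sub, p) == 0:
                    return False
    return True
```

Taking p = 11, x = (0, 1, 2, 3) and y = (1, 2, 3, 4), all 69 square submatrices are nonsingular, so [I | C] generates an [8, 4, 5] MDS code over GF(11).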


§5. MDS codes from RS codes 


Let \(\alpha_1, \ldots, \alpha_{q-1}\) be the nonzero elements of GF(q). The [q, k, d = q − k + 1]
extended RS code of §3 of Ch. 10 has parity check matrix

\[
H_1 = \begin{pmatrix}
1 & \cdots & 1 & 1 \\
\alpha_1 & \cdots & \alpha_{q-1} & 0 \\
\alpha_1^2 & \cdots & \alpha_{q-1}^2 & 0 \\
\vdots & & \vdots & \vdots \\
\alpha_1^{q-k-1} & \cdots & \alpha_{q-1}^{q-k-1} & 0
\end{pmatrix}.
\]

One more parity check can always be added, producing a [q + 1, k, q − k + 2]
MDS code, by using the parity check matrix

\[
H_2 = \begin{pmatrix}
1 & \cdots & 1 & 1 & 0 \\
\alpha_1 & \cdots & \alpha_{q-1} & 0 & 0 \\
\vdots & & \vdots & \vdots & \vdots \\
\alpha_1^{q-k-1} & \cdots & \alpha_{q-1}^{q-k-1} & 0 & 0 \\
\alpha_1^{q-k} & \cdots & \alpha_{q-1}^{q-k} & 0 & 1
\end{pmatrix}. \tag{1}
\]

To show this, we must verify that any q − k + 1 columns of \(H_2\) are linearly
independent, i.e. form a nonsingular matrix. In fact, any q − k + 1 of the first
q − 1 columns form a Vandermonde matrix (Lemma 17 of Ch. 4) and are
nonsingular. Similarly, given any q − k + 1 columns which include one or both
of the last two columns, we expand about these columns and again obtain a
Vandermonde matrix.

In fact, there exist cyclic codes with the same parameters.


Theorem 9. For any k, 1 ≤ k ≤ q + 1, there exists a [q + 1, k, q − k + 2] cyclic
MDS code over GF(q).

Proof. We only prove the case \(q = 2^m\), the case of odd q being similar. To
exclude the trivial cases we assume 2 ≤ k ≤ q − 1. Consider the polynomial
\(x^{2^m+1} + 1\) over GF(\(2^m\)). The cyclotomic cosets, under multiplication by \(2^m\) and
reduction modulo \(2^m + 1\), are

    {0}
    {1, 2^m} = {1, −1}
    {2, 2^m − 1} = {2, −2}
    ...
    {2^(m−1), 2^(m−1) + 1} = {2^(m−1), −2^(m−1)}.

For example, if \(2^m + 1 = 33\) we have the cyclotomic cosets

    {0}
    {1,32}   {9,24}
    {2,31}   {10,23}
    {3,30}   {11,22}
    {4,29}   {12,21}
    {5,28}   {13,20}
    {6,27}   {14,19}
    {7,26}   {15,18}
    {8,25}   {16,17}

Thus \(x^{2^m+1} + 1\) has, besides x + 1, only quadratic factors over GF(\(2^m\)), and
these are of the form

\[ x^2 + (\alpha^i + \alpha^{-i})x + 1 = (x + \alpha^i)(x + \alpha^{-i}), \]

where \(\alpha\) is a primitive \((2^m + 1)\)-st root of unity. Now \(\alpha \in\) GF(\(2^{2m}\)); in fact if \(\xi\)
is a primitive element of GF(\(2^{2m}\)) we may take \(\alpha = \xi^{2^m - 1}\).


Problem. (8) With this value of \(\alpha\) show \(\alpha^i + \alpha^{-i}\) is in GF(\(2^m\)).


Now consider the \([2^m + 1,\ 2^m + 1 - 2t - 1]\) cyclic code with generator
polynomial

\[ (x + 1) \prod_{i=1}^{t} \bigl(x^2 + (\alpha^i + \alpha^{-i})x + 1\bigr). \]

This has 2t + 1 consecutive zeros

\[ \alpha^{-t}, \alpha^{-t+1}, \ldots, \alpha^{-1}, 1, \alpha, \ldots, \alpha^{t-1}, \alpha^{t}; \]

thus by the BCH bound the minimum distance is at least 2t + 2. Since
\(n = 2^m + 1\), \(k = 2^m + 1 - 2t - 1\), n − k + 1 = 2t + 2, and the code is MDS. This
constructs the desired codes for all even values of k.

Similarly the code with generator polynomial

\[ \prod_{i=2^{m-1}-t+1}^{2^{m-1}} \bigl(x^2 + (\alpha^i + \alpha^{-i})x + 1\bigr) \]

has the 2t consecutive zeros

\[ \alpha^{2^{m-1}-t+1}, \ldots, \alpha^{2^{m-1}}, \alpha^{2^{m-1}+1}, \ldots, \alpha^{2^{m-1}+t} \]

and is a \([2^m + 1,\ 2^m + 1 - 2t,\ 2t + 1]\) MDS code. This gives the desired codes for
all odd k. Q.E.D.

Example. Codes of length n = 9 over GF(\(2^3\)). Let \(\xi\) be a primitive element of
GF(\(2^6\)), \(\beta = \xi^9\) a primitive element of GF(\(2^3\)), and \(\alpha = \xi^7\) a primitive 9th root of
unity. Then from Fig. 4.5, \(\beta^3 = \beta^2 + 1\), and

\[
\begin{aligned}
\alpha + \alpha^{-1} &= \xi^{7} + \xi^{56} = \beta^5, & \alpha^2 + \alpha^{-2} &= \xi^{14} + \xi^{49} = \beta^3, \\
\alpha^3 + \alpha^{-3} &= \xi^{21} + \xi^{42} = 1, & \alpha^4 + \alpha^{-4} &= \xi^{28} + \xi^{35} = \beta^6.
\end{aligned}
\]

Therefore

\[ x^9 + 1 = (x + 1)(x^2 + x + 1)(x^2 + \beta^3 x + 1)(x^2 + \beta^5 x + 1)(x^2 + \beta^6 x + 1). \]

The code over GF(8) with check polynomial \(x^2 + x + 1\) is degenerate (for
\(x^2 + x + 1\) divides \(x^3 + 1\), see Lemma 8 of Chapter 8). It is a [9, 2, 6] code with
idempotent \(x + x^2 + x^4 + x^5 + x^7 + x^8\), or 011011011 using an obvious notation.

The other three codes of dimension 2 are [9, 2, 8] codes. Their idempotents
are readily found to be:

    Idempotent                                    Check polynomial
    1   x   x^2 x^3 x^4 x^5 x^6 x^7 x^8
    0   β^5 β^3 1   β^6 β^6 1   β^3 β^5           x^2 + β^5 x + 1
    0   β^6 β^5 1   β^3 β^3 1   β^5 β^6           x^2 + β^6 x + 1
    0   β^3 β^6 1   β^5 β^5 1   β^6 β^3           x^2 + β^3 x + 1

The codes with these idempotents are minimal codes (§3 of Ch. 8) and consist
of the 9 cyclic shifts of the idempotent and their scalar multiples. The code
with generator polynomial \(x^2 + \beta^5 x + 1\) and zeros \(\alpha, \alpha^{-1}\) is a [9, 7, 3] code.
The polynomial \((x + 1)(x^2 + \beta^5 x + 1)\) has zeros \(\alpha^{-1}, 1, \alpha\) and generates a
[9, 6, 4] code; the polynomial \((x^2 + x + 1)(x^2 + \beta^6 x + 1)\) has zeros \(\alpha^3, \alpha^4, \alpha^5, \alpha^6\)
and generates a [9, 5, 5] code, and so on.


Problem. (9) Find the idempotents and weight distributions of these codes. 


The case k = 3 and q even. There are just two known cases when another
parity check can be added: when q = 2^m and k = 3 or k = q − 1.

Theorem 10. There exist [2^m + 2, 3, 2^m] and [2^m + 2, 2^m − 1, 4] triply-extended
RS (and MDS) codes.


Proof. Use the matrix


[ 1      ...  1          1 0 0 ]
[ α_1    ...  α_{q−1}    0 1 0 ]     (2)
[ α_1^2  ...  α_{q−1}^2  0 0 1 ]

(where α_1, ..., α_{q−1} are the nonzero elements of GF(2^m))
as either generator or parity check matrix. Any 3 columns are linearly
independent, since the α_i^2 are all distinct. Q.E.D.
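
For a small field the independence claim can be confirmed by exhaustive search. The sketch below is an illustration only: it assumes GF(2^3) is represented by bit masks with α^3 = α + 1, and the helper names mul and det3 are ours. It builds the q + 2 = 10 columns of (2) and tests every triple.

```python
from itertools import combinations

M, POLY = 3, 0b1011   # assumed representation: GF(2^3) with alpha^3 = alpha + 1

def mul(a, b):
    """Multiply two elements of GF(2^M), stored as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return r

def det3(c0, c1, c2):
    """Determinant of the 3 x 3 matrix with columns c0, c1, c2 (characteristic 2)."""
    (a, d, g), (b, e, h), (c, f, i) = c0, c1, c2
    return (mul(a, mul(e, i) ^ mul(f, h))
            ^ mul(b, mul(d, i) ^ mul(f, g))
            ^ mul(c, mul(d, h) ^ mul(e, g)))

# Columns (1, x, x^2) for the nonzero x, followed by the three unit vectors.
columns = [(1, x, mul(x, x)) for x in range(1, 2 ** M)]
columns += [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

all_independent = all(det3(*t) != 0 for t in combinations(columns, 3))
print(len(columns), all_independent)
```

The only triples that use the special structure of characteristic 2 are those containing the column (0, 1, 0): there the determinant reduces to α_i^2 + α_j^2 = (α_i + α_j)^2, which is nonzero for i ≠ j.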


§6. n-arcs


There is also a connection between MDS codes and finite geometries. From
Corollary 3 we see that the problem of finding an [n, k, n − k + 1] MDS code 𝒞
can be looked at as the geometric problem of finding a set S of n points in the
projective geometry PG(k − 1, q) (see Appendix B) such that every k points
of S are linearly independent, i.e. such that no k points of S lie on a
hyperplane. The coordinates of the points are the columns of a generator
matrix of 𝒞.

For example, the columns of (2) comprise 2^m + 2 points in the projective
plane PG(2, 2^m) such that no three points lie on a line.


Definition. An n-arc is a set of n points in the geometry PG(k − 1, q) such that
no k points lie in a hyperplane PG(k − 2, q), where n ≥ k ≥ 3. E.g. (2) shows a
(2^m + 2)-arc in PG(2, 2^m).


Thus another version of our problem is: 


Research Problem (11.1e). Given k and q, find the largest value of n for which 
there exists an n-arc in PG(k — 1, q). 


There is an extensive geometrical literature on this problem, but we restrict 
ourselves to just one theorem. 


Theorem 11. If 𝒞 is a nontrivial [n, k ≥ 3, n − k + 1] MDS code over GF(q), q odd,
then n ≤ q + k − 2. Equivalently, for any n-arc in PG(k − 1, q), q odd, n ≤
q + k − 2.

Proof. Let G = (g_{ij}) be a k × n generator matrix of 𝒞, let r_1, ..., r_k denote the
rows of G, and let C be the 0-chain in PG(k − 1, q) consisting of the points
whose coordinates are the columns of G. A generic point of PG(k − 1, q) will
be denoted by (x_1, ..., x_k). It is clear that the hyperplane x_1 = 0 meets C in
those points for which g_{1j} = 0. Therefore the weight of the first row, r_1, of G
equals n − number of points in which the hyperplane x_1 = 0 meets C. Similarly
the weight of the codeword Σ_{i=1}^{k} λ_i r_i of 𝒞 equals n − number of points in which
the hyperplane Σ_{i=1}^{k} λ_i x_i = 0 meets C.

We know from Corollary 7 that n ≤ q + k − 1. Suppose now n = q + k − 1,
which implies A_{n−k+2} = 0. Then C meets the hyperplanes of PG(k − 1, q) in
k − 1, k − 3, k − 4, ..., 1, or 0 points (but not k − 2 since there are no
codewords of weight n − k + 2).

Pick a hyperplane which contains k − 3 points of C, say P_1, ..., P_{k−3}. Let Σ
be a subspace PG(k − 3, q) lying in this hyperplane and containing
P_1, ..., P_{k−3}. Any hyperplane through Σ must meet C in 2 or 0 more points.
Let r be the number which meet C in 2 more points. The union of all these
hyperplanes is PG(k − 1, q), so certainly contains all the points of C. Therefore


2r + k − 3 = q + k − 1,


or 2r = q + 2, which is a contradiction since q is odd. Q.E.D.


§7. The known results


For k = 3 and q odd, Theorem 11 says n ≤ q + 1, and so the codes of
Theorem 9 have the largest possible n. Thus one case of Research Problem
11.1a is solved: m(3, q) = q + 1 if q is odd.

On the other hand for k = 3 and q even, n ≤ q + 2 by Corollary 7, and the
codes of Theorem 10 show that m(3, q) = q + 2 if q is even.

The results for general k are best shown graphically. Figure 11.2 gives the
values of k and r for which an [n = k + r, k] MDS code over GF(q) is known
to exist (thus r is the number of parity checks).

By Corollary 2 the figure is symmetric in k and r. Apart from the codes of
Theorem 10, no code is known which lies above the broken line n = k + r =
q + 1 (r ≥ 2, k ≥ 2). Codes above the heavy line are forbidden by Corollary 7.

There is good evidence that the broken line is the true upper bound, and we
state this as:


Research Problem (11.4). Prove (or disprove) that all MDS codes, with the
exception of those given in Theorem 10, lie beneath the line n = k + r = q + 1
in Fig. 11.2.


Fig. 11.2. The best [n = k + r, k] MDS codes known. ● means a code exists for all q; × means a
code exists iff q = 2^m. (Horizontal axis: k = 1, 2, 3, ..., q − 2, q − 1, q, q + 1; vertical axis: r.)


This is known to be true for codes with k ≤ 5, or q ≤ 11, or q > (4k − 9)^2,
and in some other cases.
Stated in terms of the function m(k, q) the conjecture is that


m(k, q) = q + 1   for 2 ≤ k ≤ q,
m(k, q) = k + 1   for q < k,          (3)

except for
m(3, q) = m(q − 1, q) = q + 2 if q = 2^m. (4)


§8. Orthogonal arrays


Definition. An M × n matrix A with entries from a set of q elements is called
an orthogonal array of size M, n constraints, q levels, strength k, and index λ
if any set of k columns of A contains all q^k possible row vectors exactly λ
times. Such an array is denoted by (M, n, q, k). Clearly M = λq^k. The case
q = 2 was considered in Theorem 8 of Ch. 5.


Examples. The code 𝒜_12 of Fig. 2.1 is a (12, 11, 2, 2) (see Theorem 8 of Ch. 5).
Fig. 11.3 shows a (4, 3, 2, 2), and the codewords in Fig. 10.1 form a (16, 3, 4, 2)
with entries from GF(4).


 1   1   1
 1  −1  −1
−1   1  −1
−1  −1   1


Fig. 11.3. A (4, 3, 2, 2) orthogonal array.


Theorem 12. The rows of a (q^k, n, q, k) linear orthogonal array A of index
unity and symbols from GF(q) are the codewords of an [n, k] MDS code over
GF(q), and conversely.


Proof. Any q^k × k submatrix of A contains each k-tuple exactly once ⇔ the
corresponding k coordinates can be taken as message symbols ⇔ the code is
MDS, by Corollary 3. Q.E.D.
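
The equivalence is easy to observe on a small case. The following Python sketch is an illustration under our own choices: the function name is_orthogonal_array and the ternary generator matrix G are not taken from the text. It lists the 9 codewords of a [4, 2, 3] MDS code over GF(3) and checks that they form a (9, 4, 3, 2) orthogonal array of index unity; the same routine also accepts a 4-row array with entries ±1 of the kind described in Problem 10.

```python
from itertools import combinations, product
from collections import Counter

def is_orthogonal_array(rows, q, k, index=1):
    """True if every set of k columns shows each of the q^k k-tuples exactly `index` times."""
    n = len(rows[0])
    for cols in combinations(range(n), k):
        counts = Counter(tuple(row[c] for c in cols) for row in rows)
        if len(counts) != q ** k or any(v != index for v in counts.values()):
            return False
    return True

# Hypothetical example: generator matrix of a [4, 2, 3] code over GF(3).
G = [(1, 0, 1, 1), (0, 1, 1, 2)]
codewords = [tuple((u * G[0][j] + v * G[1][j]) % 3 for j in range(4))
             for u, v in product(range(3), repeat=2)]
min_weight = min(sum(x != 0 for x in c) for c in codewords if any(c))

# Last three columns of a normalized Hadamard matrix of order 4.
hadamard_rows = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

print(min_weight, is_orthogonal_array(codewords, 3, 2),
      is_orthogonal_array(hadamard_rows, 2, 2))
```

Here the minimum weight 3 = n − k + 1 and the orthogonal array property hold together, as Theorem 12 predicts.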


Problem. (10) Show that if H_{4λ} is a normalized Hadamard matrix of order 4λ
(§3 of Ch. 2), then the last 4λ − 1 columns of H_{4λ} form a (4λ, 4λ − 1, 2, 2)
orthogonal array of index λ. Fig. 11.3 is the case λ = 1.


Thus the final version of our problem is:


Research Problem (11.1f). Find the greatest possible n in a (q^k, n, q, k)
orthogonal array of index unity.


Notes on Chapter 11 


§1. Singleton [1214] seems to have been the first to explicitly study MDS
codes. However in 1952 Bush [220] had already discovered Reed-Solomon
codes and the extensions given in Theorems 9 and 10, using the language of
orthogonal arrays (§8).

Some other redundant residue codes besides RS codes are also MDS - see
§9 of Ch. 10.

Assmus and Mattson [41] have shown that MDS codes whose length n is a
prime number are very common, by showing that every cyclic code of
prime length n over GF(p^i) is MDS for all i, for all except a finite number of primes
p.

Without giving any details we just mention that an MDS code with k = 2 is
also equivalent to a set of n − k mutually orthogonal Latin squares of order q
(Dénes and Keedwell [371, p. 351], Posner [1068], Singleton [1214]). Therefore
the more general problem of finding MDS codes over alphabets of size s (i.e.
not necessarily over a field) includes the very difficult problem of finding all
projective planes! (See Appendix B.)


§3. The weight distribution of MDS codes was found independently by 
Assmus, Gleason, Mattson and Turyn [50], Forney and Kohlenberg [436], and 
Kasami, Lin and Peterson [736]. Our derivation follows Goethals [491]. See 
also [933]. 

Corollary 7 is due to Bush. It is also given in [1214] and by Borodin [172]. 
Our proof follows Robillard [1118]. 


§4. For Problem 7 see Knuth [772, p. 36] and Pólya-Szegő [1067, vol. 2, p. 45].
§5. See the Notes to §3 of Ch. 10.


§6. Segre [1170-1173] and later Thas [1316, 1317], Casse [252], Hirschfeld
[655] and many others have studied n-arcs and related problems in finite
geometries. Two recent surveys are Barlotti [69, 70]. See also Dowling [385],
Gulati et al. [565-569].

Using methods of algebraic geometry Segre [1170; 1173, p. 312] and Casse 
[252] have improved Theorem 11 as follows: 


Theorem 13. Assume q ≥ k + 1.
(i) If k = 3, 4 or 5 then (3) and (4) hold.
(ii) If k ≥ 6 then m(k, q) ≤ q + k − 4.


Thas [1316] has shown:


Theorem 14. For q odd and q > (4k − 9)^2, m(k, q) = q + 1.


Maneri, Silverman, and Jurick ([904, 905, 703]) have shown


Theorem 15. (3) and (4) hold for q ≤ 11.


Other conditions under which (3) and (4) hold are given by Thas [1318]. 

It also follows from the geometrical theory that if q is odd then in many
(conjecturally all) cases there is a unique [n = q + 1, k, q − k + 2] MDS code.
But if q is even this is known to be false.

In a projective plane of order h, a set C of h + 1 points no 3 of which are
collinear is called an oval. Segre [1171; 1173, p. 270] has shown that in a
Desarguesian plane of odd order (i.e. h = q, an odd prime power) these points
form a conic. For example, if we take the columns of the generator matrix ((2)
without the penultimate column) of the [q + 1, 3, q − 1] MDS code as the
points, we see that they satisfy the equation x_2^2 = x_1 x_3.

If however q = 2^m, all lines which meet the oval C in a single point are
concurrent; the point in which they meet is called the nucleus or knot of C.
The points of C together with the nucleus give 2^m + 2 points no 3 of which are
collinear, and give a [2^m + 2, 3, 2^m] MDS code with the same parameters as the
code given in Theorem 10.








Alternant, Goppa and other 
generalized BCH codes 


§1. Introduction


Alternant codes are a large and powerful family of codes obtained by an
apparently small modification of the parity check matrix of a BCH code.
Recall from Ch. 7 that a BCH code of length n and designed distance δ over
GF(q) has parity check matrix H = (H_{ij}) where H_{ij} = α^{ij} (1 ≤ i ≤ δ − 1, 0 ≤ j ≤
n − 1) and α ∈ GF(q^m) is a primitive n-th root of unity. By changing H_{ij} to
α_j^{i−1} y_j, where* α = (α_1, ..., α_n) is a vector with distinct components from
GF(q^m), and y = (y_1, ..., y_n) is a vector with nonzero components from
GF(q^m), we get the alternant code 𝒜(α, y). The properties of this code are
summarized in Fig. 12.2.

The extra freedom in the definition is enough to ensure that some long
alternant codes meet the Gilbert-Varshamov bound (Theorem 3), in contrast
to the situation for BCH codes (Theorem 13 of Ch. 9).

In fact, it turns out that alternant codes form a very large class indeed, and
a great deal remains to be discovered about them. For example, which
alternant codes meet the Gilbert-Varshamov bound? How does one find the
true dimension and minimum distance?

Of course BCH codes are a special case of alternant codes. In sections 3 to
7 various other subclasses of alternant codes are defined, namely Goppa
(§3-§5 - see Fig. 12.3 for a summary), Srivastava (§6, especially Fig. 12.4) and
Chien-Choy generalized BCH (§7 and Fig. 12.5) codes. The rather complicated
relationship between these codes is indicated in Fig. 12.1 (which is
not drawn to scale).

*We apologize for using α in two different ways.


Fig. 12.1. Relationship between various subclasses of alternant codes. (Diagram of nested and overlapping regions labelled: alternant codes; Goppa codes; generalized Srivastava codes; Srivastava codes; Chien-Choy GBCH codes; BCH codes; narrow-sense, primitive BCH codes.)


The encoding and decoding of alternant codes is similar to that of BCH
codes, and is discussed in §9. The key step in decoding is the use of the
Euclidean algorithm to go from the syndrome vector to the error locator and
error evaluator polynomials. Of course this method applies equally well to
BCH codes, and so we are able to fill in the gap in decoding BCH codes that
was left in §6 of Ch. 9.

The Euclidean algorithm, for finding the greatest common divisor of two
integers or polynomials, is described in §8. As a consequence of the algorithm
we also are finally able to prove the result (Corollary 15), used several times
in earlier chapters, which says that if f and g are relatively prime polynomials
(or integers), then there exist polynomials (or integers) U and V such that


Uf + Vg = 1.
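
For integers the statement can be made concrete in a few lines. The following Python sketch is a standard form of the extended Euclidean algorithm, given only as an illustration and not necessarily in the form used in §8; the function name extended_gcd is ours. It returns the greatest common divisor together with multipliers U and V.

```python
def extended_gcd(f, g):
    """Return (d, U, V) with U*f + V*g == d, where d = gcd(f, g) for positive f, g."""
    r0, r1 = f, g
    u0, u1 = 1, 0
    v0, v1 = 0, 1
    while r1 != 0:
        q = r0 // r1
        # Each remainder stays of the form u*f + v*g, so update all three in step.
        r0, r1 = r1, r0 - q * r1
        u0, u1 = u1, u0 - q * u1
        v0, v1 = v1, v0 - q * v1
    return r0, u0, v0

d, U, V = extended_gcd(33, 32)
print(d, U, V)
```

The same loop works for polynomials over a field once integer division is replaced by polynomial division with remainder.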


§2. Alternant codes


Alternant codes are closely related to the generalized Reed-Solomon codes
GRS_k(α, v) of §8 of Ch. 10. For convenience we repeat the definition. Let
α = (α_1, ..., α_n) where the α_i are distinct elements of GF(q^m), and let v =
(v_1, ..., v_n) where the v_i are nonzero (but not necessarily distinct) elements of
GF(q^m). Then GRS_{k_0}(α, v) consists of all vectors


(v_1 F(α_1), v_2 F(α_2), ..., v_n F(α_n))     (1)

where F(z) is any polynomial of degree < k_0, with coefficients from GF(q^m).
GRS_{k_0}(α, v) is an [n, k_0, r + 1] MDS code over GF(q^m), where r = n − k_0, and has
parity check matrix

    [ y_1            y_2            ...  y_n            ]
    [ α_1 y_1        α_2 y_2        ...  α_n y_n        ]
H = [ α_1^2 y_1      α_2^2 y_2      ...  α_n^2 y_n      ]
    [ ...                                               ]
    [ α_1^{r−1} y_1  α_2^{r−1} y_2  ...  α_n^{r−1} y_n  ]

    [ 1          1          ...  1          ]
    [ α_1        α_2        ...  α_n        ]
  = [ α_1^2      α_2^2      ...  α_n^2      ]  · diag(y_1, y_2, ..., y_n)     (2)
    [ ...                                   ]
    [ α_1^{r−1}  α_2^{r−1}  ...  α_n^{r−1}  ]

  = XY (say),

where y = (y_1, ..., y_n), with y_i ∈ GF(q^m) and y_i ≠ 0, is such that GRS_{k_0}(α, v)^⊥ =
GRS_r(α, y).


Definition. The alternant code 𝒜(α, y) consists of all codewords of GRS_{k_0}(α, v)
which have components from GF(q), i.e. 𝒜(α, y) is the restriction of
GRS_{k_0}(α, v) to GF(q). Thus 𝒜(α, y) consists of all vectors a over GF(q) such
that Ha^T = 0, where H is given by (2).


A parity check matrix H̄ with elements from GF(q) can be obtained by
replacing each element of H by the corresponding column vector of length m
from GF(q), just as was done for BCH codes. Since 𝒜(α, y) is a subfield
subcode of GRS_{k_0}(α, v) it follows from §7 of Ch. 7 that 𝒜(α, y) is an [n, k, d]
code over GF(q) with


n − mr ≤ k ≤ n − r,   d ≥ r + 1.


(The properties of 𝒜(α, y) are collected in Fig. 12.2.) It is possible to obtain
this estimate on d directly from the parity check matrix.


Theorem 1. 𝒜(α, y) has minimum distance d ≥ r + 1.


Proof. Suppose a is a nonzero codeword of 𝒜(α, y) with weight ≤ r. Then
Ha^T = XYa^T = 0. Set b^T = Ya^T, then wt(b) = wt(a) since Y is diagonal and
invertible. Thus Xb^T = 0, which is impossible since X is Vandermonde, by
Lemma 17 of Ch. 4. Q.E.D.


𝒜(α, y) is defined by the parity check matrix (2) (or (3)), where
α_1, ..., α_n are distinct elements of GF(q^m) and y_1, ..., y_n are
nonzero elements of GF(q^m). 𝒜(α, y) is a linear code over GF(q)
with


length n, dimension k ≥ n − mr,
minimum distance d ≥ r + 1.


𝒜(α, y) is the restriction of GRS_{k_0}(α, v) to GF(q), where k_0 = n − r.
The dual 𝒜(α, y)^⊥ = T_m(GRS_r(α, y)) (Theorem 2). There exist
long alternant codes which meet the Gilbert-Varshamov bound
(Theorem 3). Important subclasses of alternant codes are BCH,
Goppa (§§3-5), Srivastava (§6), and Chien-Choy generalized BCH
(§7) codes. For decoding see §9.


Fig. 12.2. Properties of alternant code 𝒜(α, y).


Let C = (c_{ij}), c_{ij} ∈ GF(q^m), be any invertible matrix. Then an equally good
parity check matrix for 𝒜(α, y) is (from Problem 31 of Ch. 7)


H' = CXY

     [ c_11  c_12  ...  c_1r ]   [ 1          1          ...  1          ]
   = [ c_21  c_22  ...  c_2r ] · [ α_1        α_2        ...  α_n        ] · diag(y_1, ..., y_n)
     [ ...                   ]   [ ...                                   ]
     [ c_r1  c_r2  ...  c_rr ]   [ α_1^{r−1}  α_2^{r−1}  ...  α_n^{r−1}  ]

     [ y_1 g_1(α_1)   y_2 g_1(α_2)   ...   y_n g_1(α_n) ]
   = [ y_1 g_2(α_1)   y_2 g_2(α_2)   ...   y_n g_2(α_n) ]     (3)
     [ ...                                              ]
     [ y_1 g_r(α_1)   y_2 g_r(α_2)   ...   y_n g_r(α_n) ]


say, where


g_i(x) = c_{i1} + c_{i2} x + c_{i3} x^2 + ... + c_{ir} x^{r−1}   (i = 1, ..., r)     (4)


is a polynomial of degree < r over GF(q^m).
Note from (2) or (3) that it is natural to label the coordinates of the
codewords by α_1, ..., α_n. This is useful for encoding and decoding.


Examples. First put α = (1, α, α^2, ..., α^6), y = (1, 1, ..., 1) where α is a
primitive element of GF(2^3). Then if r = 2 the alternant code 𝒜(α, y) has
parity check matrix (2) equal to

H = X = [ 1  1  1    1    1    1    1   ]
        [ 1  α  α^2  α^3  α^4  α^5  α^6 ]

Replacing each entry by the corresponding binary vector of length 3 we
obtain

     [ 1 1 1 1 1 1 1 ]
     [ 0 0 0 0 0 0 0 ]
H̄ =  [ 0 0 0 0 0 0 0 ]
     [ 1 0 0 1 0 1 1 ]
     [ 0 1 0 1 1 1 0 ]
     [ 0 0 1 0 1 1 1 ]


Thus 𝒜(α, y) is a [7, 3, 4] code. On the other hand, if y =
(1, α, α^2, α^3, α^4, α^5, α^6) then


H = XY = [ 1  α    α^2  α^3  α^4  α^5  α^6 ]
         [ 1  α^2  α^4  α^6  α    α^3  α^5 ]


The second row of this matrix is redundant, for if


Σ_{i=1}^{n} h_i a_i = 0


where h_i ∈ GF(2^m) and a_i = 0 or 1 then


Σ_{i=1}^{n} h_i^2 a_i = (Σ_{i=1}^{n} h_i a_i)^2 = 0.


Thus we can take


     [ 1 0 0 1 0 1 1 ]
H̄ =  [ 0 1 0 1 1 1 0 ],
     [ 0 0 1 0 1 1 1 ]


and 𝒜(α, y) is now a [7, 4, 3] Hamming code. (In this case the effect of y has
been to decrease the minimum distance and increase the number of
information symbols.)
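
Both examples are small enough to verify by exhaustive search. The sketch below is an illustration under stated assumptions: GF(2^3) is represented by bit masks with α^3 = α + 1, and the helper names (mul, power, alternant_codewords, parameters) are ours. It enumerates all binary vectors a of length 7 with Ha^T = 0 and reports [n, k, d] for the two choices of y.

```python
from itertools import product

M, POLY = 3, 0b1011   # assumed representation: GF(2^3) with alpha^3 = alpha + 1

def mul(a, b):
    """Multiply two elements of GF(2^M), stored as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return r

def power(a, e):
    r = 1
    for _ in range(e):
        r = mul(r, a)
    return r

def alternant_codewords(alpha, y, r):
    """Binary vectors a with sum_j a_j * y_j * alpha_j^i = 0 for i = 0, ..., r-1."""
    n = len(alpha)
    H = [[mul(power(alpha[j], i), y[j]) for j in range(n)] for i in range(r)]
    words = []
    for a in product((0, 1), repeat=n):
        syndromes = [0] * r
        for j, bit in enumerate(a):
            if bit:
                for i in range(r):
                    syndromes[i] ^= H[i][j]
        if not any(syndromes):
            words.append(a)
    return words

def parameters(words):
    """(n, k, d) of a binary linear code given as the full list of its codewords."""
    n = len(words[0])
    k = len(words).bit_length() - 1
    d = min(sum(w) for w in words if any(w))
    return n, k, d

ALPHA = 0b010
locators = [power(ALPHA, j) for j in range(7)]        # 1, alpha, ..., alpha^6

params_y_ones = parameters(alternant_codewords(locators, [1] * 7, 2))
params_y_alpha = parameters(alternant_codewords(locators, locators, 2))
print(params_y_ones, params_y_alpha)
```

The search works over GF(2^3) directly rather than through the binary matrix H̄; the two conditions are equivalent, since an element of GF(2^3) is zero exactly when all three of its binary coordinates are zero.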

Other examples of alternant codes are BCH codes. For the parity check
matrix of a general BCH code is (Equation (19) of Ch. 7)


    [ 1   α^b        α^{2b}        ...   α^{(n−1)b}       ]
H = [ 1   α^{b+1}    α^{2(b+1)}    ...   α^{(n−1)(b+1)}   ]
    [ ...                                                 ]
    [ 1   α^{b+δ−2}  α^{2(b+δ−2)}  ...   α^{(n−1)(b+δ−2)} ]

    [ 1   1        1           ...   1              ]
  = [ 1   α        α^2         ...   α^{n−1}        ]  · diag(1, α^b, α^{2b}, ..., α^{(n−1)b}),
    [ ...                                           ]
    [ 1   α^{δ−2}  α^{2(δ−2)}  ...   α^{(n−1)(δ−2)} ]


which is the parity check matrix of an alternant code.

The dual of an alternant code.


Theorem 2. The dual of the alternant code 𝒜(α, y) is the code
T_m(GRS_r(α, y)) = T_m(GRS_{k_0}(α, v)^⊥).     (5)


Recall from §7 of Ch. 7 that if 𝒞^# is any code over GF(q^m), the trace code
T_m(𝒞^#) is the code over GF(q) consisting of all distinct vectors


(T_m(c_1), ..., T_m(c_n))   where (c_1, ..., c_n) ∈ 𝒞^#.

Proof. From Theorem 11 of Ch. 7 and Theorem 4 of Ch. 10. Q.E.D.


An illustration of this theorem will be found following Theorem 11 of Ch.
7.


Long alternant codes are good. 


Theorem 3. (a) Given n, h and δ, let m be any number dividing n − h. Then
there exists an alternant code 𝒜(α, y) over GF(q) with parameters [n, k ≥
h, d ≥ δ] provided


Σ_{w=1}^{δ−1} (q − 1)^w C(n, w) < (q^m − 1)^{(n−h)/m},     (6)

where C(n, w) denotes the binomial coefficient.
(b) Hence there exist long alternant codes which meet the Gilbert-
Varshamov bound.


Proof. (i) Let a be any nonzero vector over GF(q). The number of codes GRS_{k_0}(α, v)
which contain a, for fixed k_0 and α, and varying v, is at most (q^m − 1)^{k_0}. To
show this, observe from (1) that GRS_{k_0}(α, v) contains a iff

a_i / v_i = F(α_i),   i = 1, ..., n,

for some polynomial F(z) of degree < k_0. Since F(z) is determined by its
values at k_0 points, we can choose at most k_0 v_i's before F(z) is
determined. There are q^m − 1 choices for each of these v_i's.

(ii) Consider the family ℱ of alternant codes 𝒜(α, y) which are restrictions
of some GRS_{k_0}(α, v) to GF(q), where k_0 = n − (n − h)/m. Then 𝒜(α, y) ∈ ℱ
has dimension k ≥ n − m(n − k_0) = h. From (i) the number of 𝒜(α, y) ∈ ℱ
which contain a is at most (q^m − 1)^{n−(n−h)/m}. Therefore the number of
𝒜(α, y) ∈ ℱ with minimum distance < δ is at most


(q^m − 1)^{n−(n−h)/m} Σ_{w=1}^{δ−1} (q − 1)^w C(n, w).     (7)

(iii) The total number of codes 𝒜(α, y) in ℱ is equal to the number of
choices for v, which is (q^m − 1)^n. So if this number exceeds (7), there exists an
𝒜(α, y) with dimension ≥ h and minimum distance ≥ δ. This proves (a).
Asymptotically, when n is large and h/n fixed, (6) is the same as the
Gilbert-Varshamov bound (Theorem 12 of Ch. 1, Theorem 30 of Ch. 17).

Q.E.D.


Of course Theorem 3 doesn't say which alternant codes are the best, only 
that good ones exist. Since the class of alternant codes is so large, it is useful 
to have names for some subclasses. In the following sections we shall 
describe the subclasses known as Goppa, Srivastava and Chien-Choy 
generalized BCH codes. These are obtained by placing restrictions on a or y, 
or both. 


Problem. (1) Consider the binary alternant code with n = 6, α_i = α^{i+1} for
i = 0, ..., 5 where α is a primitive element of GF(2^3), all y_i = 1, g_1(x) = 1 + x
and g_2(x) = x. Show that this is a [6, 2, 4] code.


§3. Goppa codes


This is the most interesting subclass of alternant codes. Just as cyclic codes
are specified in terms of a generator polynomial (Theorem 1 of Ch. 7), so
Goppa codes are described in terms of a Goppa polynomial G(z). In contrast
to cyclic codes, where it is difficult to estimate the minimum distance d from
the generator polynomial, Goppa codes have the property that d ≥
deg G(z) + 1. We first give the definition in terms of Goppa polynomials and
then show that these are alternant codes.

The definition of a Goppa code of length n with symbols from GF(q) calls
for two things: a polynomial G(z) called the Goppa polynomial, having
coefficients from GF(q^m), for some fixed m, and a subset L = {α_1, ..., α_n} of
GF(q^m) such that G(α_i) ≠ 0 for all α_i ∈ L. Usually L is taken to be all the
elements of GF(q^m) which are not zeros of G(z).

With any vector a = (a_1, ..., a_n) over GF(q) we associate the rational
function


R_a(z) = Σ_{i=1}^{n} a_i / (z − α_i).     (8)


Definition. The Goppa code Γ(L, G) (or Γ) consists of all vectors a such that
R_a(z) ≡ 0 mod G(z),     (9)


or equivalently such that R_a(z) = 0 in the polynomial ring GF(q^m)[z]/G(z).
If G(z) is irreducible then Γ is called an irreducible Goppa code.
Figure 12.3 shows the basic properties of these codes. Examples will be
given after Theorem 6.


Γ(L, G) is a linear code over GF(q), defined by Equation (9).


length n = |L|
dimension k ≥ n − mr, r = deg G(z)
minimum distance d ≥ r + 1.


Γ(L, G) = alternant code 𝒜(α, y) where y_i = G(α_i)^{−1}. Γ(L, G)^⊥ =
T_m(GRS_r(α, y)). In the binary case if G(z) has no multiple zeros
then d ≥ 2r + 1. There exist long Goppa codes which meet the
Gilbert-Varshamov bound. Extended binary double-error-correcting
Goppa codes are cyclic (§5).


Fig. 12.3. Properties of the Goppa code Γ(L, G).





The parity check matrix of Γ. It is obvious that Γ is a linear code. The parity
check matrix can be found from (9). For in the ring of polynomials mod G(z),
z − α_i has an inverse (since it does not divide G(z)). The inverse is


(z − α_i)^{−1} = − [(G(z) − G(α_i)) / (z − α_i)] G(α_i)^{−1},


for indeed

−(z − α_i) · [(G(z) − G(α_i)) / (z − α_i)] G(α_i)^{−1} ≡ 1 mod G(z).

Therefore a is in Γ(L, G) iff

Σ_{i=1}^{n} a_i [(G(z) − G(α_i)) / (z − α_i)] G(α_i)^{−1} = 0     (10)


as a polynomial (not mod G(z)). If G(z) = Σ_{i=0}^{r} g_i z^i, with g_i ∈ GF(q^m) and g_r ≠ 0,
then

(G(z) − G(α_i)) / (z − α_i) = g_r(z^{r−1} + z^{r−2} α_i + ... + α_i^{r−1}) + g_{r−1}(z^{r−2} + ... + α_i^{r−2}) + ...
        + g_2(z + α_i) + g_1.


Equating the coefficients of z^{r−1}, z^{r−2}, ..., 1 to zero in (10) we see that a is in
Γ(L, G) iff Ha^T = 0, where


    [ g_r G(α_1)^{−1}                                    ...  g_r G(α_n)^{−1}                                    ]
H = [ (g_{r−1} + α_1 g_r) G(α_1)^{−1}                    ...  (g_{r−1} + α_n g_r) G(α_n)^{−1}                    ]
    [ ...                                                                                                        ]
    [ (g_1 + α_1 g_2 + ... + α_1^{r−1} g_r) G(α_1)^{−1}  ...  (g_1 + α_n g_2 + ... + α_n^{r−1} g_r) G(α_n)^{−1}  ]

    [ g_r      0        0    ...  0   ]   [ 1          1          ...  1          ]
    [ g_{r−1}  g_r      0    ...  0   ]   [ α_1        α_2        ...  α_n        ]
  = [ g_{r−2}  g_{r−1}  g_r  ...  0   ] · [ α_1^2      α_2^2      ...  α_n^2      ] · diag(G(α_1)^{−1}, ..., G(α_n)^{−1})
    [ ...                             ]   [ ...                                   ]
    [ g_1      g_2      g_3  ...  g_r ]   [ α_1^{r−1}  α_2^{r−1}  ...  α_n^{r−1}  ]

  = CXY (say),     (11)


is a parity check matrix for Γ(L, G). Since C is invertible, by Problem 31 of
Ch. 7 another parity check matrix is


H' = XY

     [ G(α_1)^{−1}            ...  G(α_n)^{−1}            ]
   = [ α_1 G(α_1)^{−1}        ...  α_n G(α_n)^{−1}        ],     (12)
     [ ...                                                ]
     [ α_1^{r−1} G(α_1)^{−1}  ...  α_n^{r−1} G(α_n)^{−1}  ]


and this is usually the simplest to use.

A parity check matrix with elements from GF(q) is then obtained by
replacing each entry of H (or H') by the corresponding column vector of
length m from GF(q).

Comparing (11) with (3) we see that Γ(L, G) is an alternant code 𝒜(α, y)
with α = (α_1, ..., α_n) and y = (G(α_1)^{−1}, ..., G(α_n)^{−1}). Therefore Γ(L, G) has
dimension k ≥ n − rm and minimum distance d ≥ r + 1.

In fact it is easy to find the generalized Reed-Solomon code which produces
Γ(L, G).


Theorem 4. Γ(L, G) is the restriction to GF(q) of GRS_{n−r}(α, v), where v =
(v_1, ..., v_n) and


v_i = G(α_i) / ∏_{j≠i} (α_i − α_j),   i = 1, ..., n.


Proof. (i) Take u ∈ GRS_{n−r}(α, v) | GF(q). Then


u_i = v_i F(α_i) = F(α_i) G(α_i) / ∏_{j≠i} (α_i − α_j),


where F(z) is a polynomial of degree < n − r. Thus


Σ_{i=1}^{n} u_i / (z − α_i) = [1 / ∏_{i=1}^{n} (z − α_i)] Σ_{i=1}^{n} F(α_i) G(α_i) ∏_{j≠i} (z − α_j)/(α_i − α_j).

Let

N(z) = Σ_{i=1}^{n} F(α_i) G(α_i) ∏_{j≠i} (z − α_j)/(α_i − α_j).


Then N(α_i) = F(α_i) G(α_i) for i = 1, ..., n. Also deg N(z) ≤ n − 1 and
deg F(z)G(z) ≤ n − 1. Since the polynomial N(z) − F(z)G(z) is determined by
its values at n points, N(z) = F(z)G(z). Therefore


Σ_{i=1}^{n} u_i / (z − α_i) = F(z)G(z) / ∏_{i=1}^{n} (z − α_i) ≡ 0 mod G(z),


and hence u ∈ Γ(L, G). Thus
Γ(L, G) ⊇ GRS_{n−r}(α, v) | GF(q).


(ii) The converse is similar and is left to the reader. Q.E.D.
From Theorem 2 we obtain:


Theorem 5. The dual of a Goppa code is given by


Γ(L, G)^⊥ = T_m(GRS_r(α, y))     (13)
where y_i = G(α_i)^{−1}.


Problem. (2) Prove directly that GRS_r(α, y), where y_i = G(α_i)^{−1}, and
GRS_{n−r}(α, v), where


v_i = G(α_i) / ∏_{j≠i} (α_j − α_i),


are dual codes.


Binary Goppa codes. Just as for BCH codes one can say a bit more in the
binary case (cf. §6 of Ch. 7). Suppose Γ = Γ(L, G) is a binary Goppa code
(with q = 2). Let a = (a_1 ⋯ a_n) be a codeword of weight w in Γ, with
a_{l_1} = ⋯ = a_{l_w} = 1, and define


f_a(z) = ∏_{i=1}^{w} (z − α_{l_i}).     (14)


Then


R_a(z) = Σ_{i=1}^{w} 1/(z − α_{l_i}) = f_a'(z) / f_a(z),   from (8).     (15)


The α_i's are distinct, from the definition of Γ, so f_a'(z) and f_a(z) have no
common factors, and (15) is in lowest terms. Since G(α_i) ≠ 0, f_a(z) and G(z)
are relatively prime, and so from (15)
R_a(z) ≡ 0 mod G(z) iff G(z) | f_a'(z).
We are working mod 2, so f_a'(z) contains only even powers and is a perfect
square. Let Ḡ(z) be the lowest degree perfect square which is divisible by
G(z). Then
G(z) | f_a'(z) iff Ḡ(z) | f_a'(z).


We conclude that
a ∈ Γ iff R_a(z) ≡ 0 mod G(z)


iff Ḡ(z) | f_a'(z).     (16)
In particular, if a ≠ 0, deg f_a'(z) ≥ deg Ḡ(z). Hence
min. distance of Γ ≥ deg Ḡ(z) + 1.     (17)


An important special case is:


Theorem 6. Suppose G(z) has no multiple zeros, so that Ḡ(z) = G(z)^2. Then
min. distance of Γ ≥ 2 deg G(z) + 1.     (18)


If G(z) has no multiple zeros then Γ is called a separable Goppa code.
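
The step "f_a'(z) contains only even powers and is a perfect square" can be spelled out as follows. This is a short worked derivation in our own notation: the coefficients c_j are introduced here only for illustration, and c_j^{1/2} denotes the unique square root of c_j in GF(2^m).

```latex
% Assumption: f_a(z) = \sum_{j=0}^{w} c_j z^j with coefficients c_j \in GF(2^m).
\begin{align*}
f_a'(z) &= \sum_{j=1}^{w} j\, c_j z^{j-1}
         = \sum_{j \text{ odd}} c_j z^{j-1}
         && \text{(terms with even } j \text{ vanish mod 2)} \\
        &= \Bigl( \sum_{j \text{ odd}} c_j^{1/2}\, z^{(j-1)/2} \Bigr)^{2}
         && \text{(squaring is additive in characteristic 2).}
\end{align*}
% Example with w = 3: f_a(z) = z^3 + c_2 z^2 + c_1 z + c_0 gives
% f_a'(z) = z^2 + c_1 = (z + c_1^{1/2})^2.
```

Since f_a'(z) is a square, any irreducible factor of G(z) that divides it must do so to an even power, which is why divisibility by G(z) is equivalent to divisibility by the smallest perfect square divisible by G(z).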


Examples of binary Goppa codes. (1) Take G(z) = z^2 + z + 1, L = GF(2^3) =
{0, 1, α, ..., α^6} where α is primitive, q = 2, and q^m = 8. Certainly G(β) ≠ 0
for β ∈ GF(2^3), for the zeros of z^2 + z + 1 belong to GF(2^2), GF(2^4), GF(2^6), ...
but not to GF(2^3) - see Theorem 8 of Ch. 4. We obtain an irreducible Goppa
code Γ of length n = |L| = 8, dimension k ≥ 8 − 2·3 = 2, and minimum distance
d ≥ 5. From (12) a parity check matrix is


H = [ 1/G(0)   1/G(1)   1/G(α)   ...   1/G(α^6)   ]
    [ 0/G(0)   1/G(1)   α/G(α)   ...   α^6/G(α^6) ]

From Fig. 4.5 we find


H = [ 1  1  α^2  α^4  α^2  α    α    α^4 ]
    [ 0  1  α^3  α^6  α^5  α^5  α^6  α^3 ]

    [ 1 1 0 0 0 0 0 0 ]
    [ 0 0 0 1 0 1 1 1 ]
  = [ 0 0 1 1 1 0 0 1 ]
    [ 0 1 1 1 1 1 1 1 ]
    [ 0 0 1 0 1 1 0 1 ]
    [ 0 0 0 1 1 1 1 0 ]

The codewords are

0 0 0 0 0 0 0 0
0 0 1 1 1 1 1 1
1 1 0 0 1 0 1 1
1 1 1 1 0 1 0 0

so this is an [8, 2, 5] Goppa code. By adding an overall parity check and
reordering the columns the following [9, 2, 6] code is obtained:

la ab oa a 
0 0 0 0 0 0 0 0 0
0 1 1 0 1 1 0 1 1          (19)
1 0 1 1 0 1 1 0 1
1 1 0 1 1 0 1 1 0


This code is cyclic! An explanation for this phenomenon will be given in §5.
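
The [8, 2, 5] Goppa code of example (1) can be recomputed directly from the parity check matrix (12). The following Python sketch is an illustration under stated assumptions: GF(2^3) is represented by bit masks with α^3 = α + 1, the coordinates are ordered 0, 1, α, ..., α^6, and the helper names are ours. It evaluates 1/G(β) and β/G(β) for every β in L and searches all 2^8 binary vectors.

```python
from itertools import product

M, POLY = 3, 0b1011   # assumed representation: GF(2^3) with alpha^3 = alpha + 1

def mul(a, b):
    """Multiply two elements of GF(2^M), stored as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return r

def power(a, e):
    r = 1
    for _ in range(e):
        r = mul(r, a)
    return r

def G(z):
    """Goppa polynomial G(z) = z^2 + z + 1."""
    return mul(z, z) ^ z ^ 1

def inv(a):
    return power(a, 6)          # a^7 = 1 for every nonzero a in GF(2^3)

ALPHA = 0b010
L = [0] + [power(ALPHA, j) for j in range(7)]     # 0, 1, alpha, ..., alpha^6
y = [inv(G(b)) for b in L]                        # first row of (12)
H = [y, [mul(b, yb) for b, yb in zip(L, y)]]      # second row: beta / G(beta)

def in_code(a):
    """True if the binary vector a satisfies H a^T = 0 over GF(2^3)."""
    for row in H:
        s = 0
        for bit, h in zip(a, row):
            if bit:
                s ^= h
        if s:
            return False
    return True

codewords = sorted(a for a in product((0, 1), repeat=8) if in_code(a))
min_distance = min(sum(c) for c in codewords if any(c))
for c in codewords:
    print(c)
```

Appending an overall parity bit to the three nonzero codewords gives words of weight 6, in agreement with the [9, 2, 6] code (19).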

This example can also be used as an illustration of Theorem 4. Here
n − r = 8 − 2 = 6, and v_i = G(α_i) = α_i^2 + α_i + 1, since ∏_{j≠i}(α_i − α_j) = 1 for all i.
Thus Theorem 4 states that the [8, 2, 5] Goppa code is the restriction to GF(2)
of the code over GF(2^3) with generator matrix


[ 1  1  α^5  α^3  α^5  α^6  α^6  α^3 ]
[ 0  1  α^6  α^5  α    α^3  α^4  α^2 ]
[ 0  1  1    1    α^4  1    α^2  α   ]
[ 0  1  α    α^2  1    α^4  1    1   ]
[ 0  1  α^2  α^4  α^3  α    α^5  α^6 ]
[ 0  1  α^3  α^6  α^6  α^5  α^3  α^5 ]

It is readily checked that
row 1 + row 2 + row 6 = 1 1 1 1 0 1 0 0,
row 1 + row 5 + row 6 = 1 1 0 0 1 0 1 1,
row 2 + row 5 = 0 0 1 1 1 1 1 1.


(2) Take G(z) = z^3 + z + 1 and L = GF(2^5). Again G(β) ≠ 0 for β ∈ L (using
Theorem 8 of Ch. 4). Then Γ(L, G) is a [32, 17, 7] irreducible Goppa code,
with parity check matrix given by Equation (12) (here α is a primitive element
of GF(2^5)):


    [ 1/G(0)     1/G(1)     1/G(α)     ...   1/G(α^30)     ]
H = [ 0/G(0)     1/G(1)     α/G(α)     ...   α^30/G(α^30)  ]
    [ 0^2/G(0)   1^2/G(1)   α^2/G(α)   ...   α^60/G(α^30)  ]

    [ 1  1  α^4  α^8   α^14  ...  α^26 ]
  = [ 0  1  α^5  α^10  α^17  ...  α^25 ]
    [ 0  1  α^6  α^12  α^20  ...  α^24 ]

    [ 1 1 0 1 1 ... 1 ]
    [ 0 0 0 0 0 ... 1 ]
    [ 0 0 0 1 1 ... 1 ]
    [ 0 0 0 1 1 ... 0 ]
    [ 0 0 1 0 1 ... 1 ]
    [ 0 1 1 1 1 ... 1 ]
    [ 0 0 0 0 1 ... 0 ]
  = [ 0 0 1 0 0 ... 0 ]
    [ 0 0 0 0 0 ... 1 ]
    [ 0 0 0 1 1 ... 1 ]
    [ 0 1 0 0 0 ... 0 ]
    [ 0 0 1 1 0 ... 1 ]
    [ 0 0 0 1 1 ... 1 ]
    [ 0 0 1 1 1 ... 1 ]
    [ 0 0 0 0 0 ... 1 ]


where we have used the table of GF(2^5) given in Fig. 4.5. The weight
distribution of Γ(L, G) was found (by computer) to be:


i:     0   7    8    9    10    11    12    13     14     15
A_i:   1   128  400  800  1903  4072  6876  10360  14420  17448


i:     16     17     18     19     20    21    22    23   24   25   26
A_i:   18381  17336  14330  10360  6860  4136  2068  760  250  136  47
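
A quick consistency check on the table: the A_i must add up to the number of codewords, 2^17. The Python sketch below simply transcribes the tabulated values (the dictionary name A is ours) and sums them.

```python
# Weight distribution A_i of the [32, 17, 7] Goppa code, transcribed from the table.
A = {0: 1, 7: 128, 8: 400, 9: 800, 10: 1903, 11: 4072, 12: 6876, 13: 10360,
     14: 14420, 15: 17448, 16: 18381, 17: 17336, 18: 14330, 19: 10360,
     20: 6860, 21: 4136, 22: 2068, 23: 760, 24: 250, 25: 136, 26: 47}

total = sum(A.values())                      # should equal the number of codewords
min_distance = min(i for i in A if i > 0)    # smallest nonzero weight
print(total, 2 ** 17, min_distance)
```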


(3) Of course the coefficients of G(z) need not be restricted to 0's and 1's.
For example we could take G(z) = z^2 + z + α^3, where α is a primitive element
of GF(2^4). From Theorem 15 of Ch. 9 G(z) is irreducible over GF(2^4), since
T_4(α^3) = α^3 + α^6 + α^12 + α^9 = 1. Therefore we can take L = GF(2^4), and obtain a
[16, 8, 5] irreducible Goppa code.
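
The trace condition is easily checked by machine. The sketch below is an illustration that assumes GF(2^4) is represented by bit masks with α^4 = α + 1 (a different primitive polynomial would change the table of powers); the helper names are ours. It computes T_4(α^3) and confirms by direct search that z^2 + z + α^3 has no zero in GF(2^4).

```python
M, POLY = 4, 0b10011   # assumed representation: GF(2^4) with alpha^4 = alpha + 1

def mul(a, b):
    """Multiply two elements of GF(2^M), stored as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return r

def power(a, e):
    r = 1
    for _ in range(e):
        r = mul(r, a)
    return r

def trace(x):
    """T_M(x) = x + x^2 + x^4 + ... + x^(2^(M-1))."""
    t, y = 0, x
    for _ in range(M):
        t ^= y
        y = mul(y, y)
    return t

ALPHA = 0b0010
c = power(ALPHA, 3)
trace_of_c = trace(c)
roots = [z for z in range(2 ** M) if (mul(z, z) ^ z ^ c) == 0]
print(trace_of_c, roots)
```

A quadratic with no zero in the field is irreducible over it, so the empty list of roots agrees with the trace criterion.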


Problem. (3) Find a parity check matrix for this code. 


(4) Irreducible Goppa codes. Consider G(z) = z^3 + z + 1, which is
irreducible over GF(2). The zeros of G(z) lie in GF(2^3) and hence by Theorem 8
of Ch. 4 are in GF(2^6), GF(2^9), .... Provided m is not a multiple of 3 we can
take L = GF(2^m), and obtain an


[n = 2^m, k ≥ 2^m − 3m, d ≥ 7]   (for 3 ∤ m)     (20)


irreducible Goppa code. When m = 5, the bounds for k and d are exact, as we
saw in example (2).
Alternatively, taking G(z) to be an irreducible cubic over GF(2^m) we get a
code with parameters (20) for any m.

More generally, taking G(z) to be an irreducible polynomial of degree r
over GF(2^m) we obtain an


[n = 2^m, k ≥ 2^m − rm, d ≥ 2r + 1]     (21)


irreducible Goppa code for any r and m. The comparable primitive BCH code
has parameters


[n = 2^m − 1, k ≥ 2^m − 1 − rm, d ≥ 2r + 1],     (22)


which (if equality holds for k and d in (21) and (22)) has one fewer
information symbol.


Problem. (4) Let G(z) have degree r, distinct zeros, coefficients in GF(2^s), and
satisfy G(0) ≠ 0, G(1) ≠ 0. Let GF(2^t) be the smallest field which contains all
the zeros of G(z). Show that we can choose L = GF(2^m) for any m such that
s | m and (t, m) = 1, and obtain a Goppa code with parameters (21).

(5) BCH codes. Narrow-sense, primitive BCH codes are a special case of
Goppa codes: choose G(z) = z^r and L = {1, α, ..., α^{n−1}} where n = q^m − 1 and
α is a primitive element of GF(q^m). Then from Eq. (12),


    [ 1   α^{−r}       α^{−2r}       ...   α^{−(n−1)r}     ]
H = [ 1   α^{−(r−1)}   α^{−2(r−1)}   ...   α^{−(n−1)(r−1)} ]
    [ ...                                                  ]
    [ 1   α^{−1}       α^{−2}        ...   α^{−(n−1)}      ]


which becomes the parity check matrix of a BCH code (Equation (19) of Ch.
7) when α^{−1} is replaced by β.

To obtain a t-error-correcting binary BCH code we take G(z) = z^{2t} and
L = GF(2^m)*.

In examples (1) and (2) it turned out that k and d coincided with the
bounds given in Fig. 12.3. But this is not always so, as the example of BCH
codes shows.


Research Problem (12.1). Find the true dimension and minimum distance of a 
Goppa code. 


Problem. (5) Another form for the parity check matrix. Suppose G(z) has no
multiple zeros, say G(z) = (z − z_1) ⋯ (z − z_r), where z_1, ..., z_r are distinct
elements of GF(2^s). Show that a ∈ Γ(L, G) iff Ha^T = 0, where H = (H_{ij}),
H_{ij} = 1/(z_i − α_j), for 1 ≤ i ≤ r, 1 ≤ j ≤ n.


Remark. Note that in this problem H is a Cauchy matrix (see Problem 7 of
Ch. 11). If H is the parity check matrix of a t-error-correcting code then
every 2t columns of H must be linearly independent, from Theorem 10 of Ch.
1. Now in classical matrix theory there are two complex matrices with the
property that every square submatrix is nonsingular. These are the Vandermonde
and Cauchy matrices - see Lemma 17 of Ch. 4 and Problem 7 of Ch.
11. The Vandermonde matrix is the basis for the definition of a BCH code (§6
of Ch. 7), and we have just seen that the Cauchy matrix is the basis for
separable Goppa codes.


Problem. (6) A Goppa code with G(z) = (z − β)^r for some β is called
cumulative. Show that there is a weight-preserving one-to-one mapping between
Γ(GF(2^m) − {β}, (z − β)^r) and the BCH code Γ(GF(2^m)*, z^r).


Example (1) suggests the following problem. 


Problem. (7) (Cordaro and Wagner.) Let $\mathscr{C}_n$ be that [n, 2, d] binary code with
the highest d and which corrects the most errors of weight $[\tfrac{1}{2}(d - 1)] + 1$. Set
$r = [\tfrac{1}{3}(n + 1)]$. Show that $d = 2r$ if $n \equiv 0$ or 1 mod 3, and $d = 2r - 1$ if $n \equiv 2$
mod 3; and that a generator matrix for $\mathscr{C}_n$ can be taken to consist of r
columns equal to $\binom{0}{1}$, r columns equal to $\binom{1}{0}$, and the remaining columns equal
to $\binom{1}{1}$.
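The distance claimed in Problem 7 can be checked by exhaustive search for small n. The following Python sketch is an illustration only: the function names are ours, and it assumes that zero columns never help, so that an [n, 2] binary code is described (up to equivalence) by how many generator-matrix columns equal each of the three nonzero column types.

```python
def min_distance(columns):
    """Minimum weight over the three nonzero codewords of the [n, 2] code
    whose generator matrix has the given columns (pairs of bits)."""
    weights = []
    for u, v in [(0, 1), (1, 0), (1, 1)]:      # the three nonzero messages
        weights.append(sum((u * a + v * b) % 2 for a, b in columns))
    return min(weights)


def cordaro_wagner_columns(n):
    """Columns described in Problem 7: r of type (0,1), r of type (1,0),
    and the remaining n - 2r of type (1,1), with r = floor((n + 1) / 3)."""
    r = (n + 1) // 3
    return [(0, 1)] * r + [(1, 0)] * r + [(1, 1)] * (n - 2 * r)


def best_distance(n):
    """Largest minimum distance over all choices of column multiplicities."""
    best = 0
    for i in range(n + 1):
        for j in range(n + 1 - i):
            cols = [(0, 1)] * i + [(1, 0)] * j + [(1, 1)] * (n - i - j)
            best = max(best, min_distance(cols))
    return best


def claimed_distance(n):
    """d = 2r if n = 0 or 1 mod 3, and d = 2r - 1 if n = 2 mod 3."""
    r = (n + 1) // 3
    return 2 * r if n % 3 in (0, 1) else 2 * r - 1
```

For n = 8 and n = 9 this gives d = 5 and d = 6, the parameters of the [8, 2, 5] and [9, 2, 6] codes met elsewhere in this chapter.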


§4. Further properties of Goppa codes


Adding an overall parity check. Let $\Gamma(L, G)$ be a Goppa code over $GF(q)$ of
length $n = q^m$, with $L = GF(q^m) = \{0, 1, \alpha, \ldots, \alpha^{n-2}\}$ and $G(z)$ a polynomial
of degree r with no zeros in $GF(q^m)$. From (12), $a = (a(0), a(1), \ldots, a(\alpha^{n-2}))$
is in $\Gamma(L, G)$ iff

$$\sum_{\beta \in GF(q^m)} \frac{\beta^i a(\beta)}{G(\beta)} = 0 \quad \text{for } i = 0, 1, \ldots, r - 1. \tag{23}$$

$\Gamma(L, G)$ may be extended by adding an overall parity check $a(\infty)$ given by

$$a(\infty) = - \sum_{\beta \in GF(q^m)} a(\beta)$$

or

$$\sum_{\beta \in GF(q^m) \cup \{\infty\}} a(\beta) = 0. \tag{24}$$

With the convention that $1/\infty = 0$, the range of summation in (23) can be
changed to $GF(q^m) \cup \{\infty\}$. Finally, combining (23) and (24), we obtain the
result that $\hat{a} = (a(0), a(1), \ldots, a(\alpha^{n-2}), a(\infty))$ is in the extended Goppa code iff

$$\sum_{\beta \in GF(q^m) \cup \{\infty\}} \frac{\beta^i a(\beta)}{G(\beta)} = 0 \quad \text{for } i = 0, 1, \ldots, r. \tag{25}$$

The extended code will be denoted by $\hat{\Gamma}(L, G)$.


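Problem. (8)(a) Let $\pi$ be the permutation of $GF(q^m) \cup \{\infty\}$ defined by

$$\pi: \gamma \to \frac{a\gamma + b}{c\gamma + d}, \tag{26}$$

where $a, b, c, d \in GF(q^m)$ satisfy $ad - bc \ne 0$, and $\gamma \in GF(q^m) \cup \{\infty\}$. First
check that this is a permutation of $GF(q^m) \cup \{\infty\}$. Then show that $\pi$ sends the
extended Goppa code $\hat{\Gamma}(L, G)$ into the equivalent code $\hat{\Gamma}(L, G_1)$, where

$$G_1(z) = (cz + d)^r \, G\!\left(\frac{az + b}{cz + d}\right).$$

(b) Suppose $G(z)$ is such that

$$G(z) = \epsilon (cz + d)^r \, G\!\left(\frac{az + b}{cz + d}\right)$$

for some $\epsilon \in GF(q^m)^*$. Show that $\pi$ fixes $\hat{\Gamma}(L, G)$.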


Goppa codes and Mattson-Solomon polynomials. In the special case in which 
L consists of all n roots of unity, a nice description of Goppa codes can be 
given in terms of Mattson-Solomon polynomials (see Cor. 8). Only the binary 
case will be considered. 

Recall from §6 of Ch. 8 that if $a(x) = \sum_{i=0}^{n-1} a_i x^i$, $a_i = 0$ or 1, is any polynomial,
its Mattson-Solomon polynomial $A(z)$ is given by

$$A(z) = \sum_{i=0}^{n-1} A_{-i} z^i, \tag{27}$$

where $A_i = a(\alpha^i)$ and $\alpha \in GF(2^m)$ is a primitive $n$th root of unity. The inverse
operation is

$$a(x) = \sum_{i=0}^{n-1} A(\alpha^i) x^i. \tag{28}$$

If $L = \{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}\}$, there is a relation between

$$R_a(z) = \sum_{i=0}^{n-1} \frac{a_i}{z + \alpha^i}$$

of Equation (8) and $A(z)$. This is given by the following theorem. We use the
notation that if $f(y)$ is any polynomial, $[f(y)]_n$ denotes the remainder when
$f(y)$ is divided by $y^n - 1$.

Theorem 7.

$$A(z) = [z(z^n + 1) R_a(z)]_n, \tag{29}$$

$$R_a(z) = \sum_{i=0}^{n-1} \frac{A(\alpha^i)}{z + \alpha^i}. \tag{30}$$


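Proof. Write $R_a(z) = F(z)/(z^n + 1)$, where

$$F(z) = \sum_{i=0}^{n-1} a_i \prod_{j \ne i} (z + \alpha^j).$$

Then

$$z(z^n + 1) R_a(z) = zF(z) = \sum_{i=0}^{n-1} a_i z \prod_{j \ne i} (z + \alpha^j).$$

Also

$$A(z) = \sum_{i=0}^{n-1} a_i \sum_{j=0}^{n-1} \alpha^{-ij} z^j.$$

Then (29) will follow if we show that

$$\Bigl[ z \prod_{j \ne i} (z + \alpha^j) \Bigr]_n = \sum_{j=0}^{n-1} \alpha^{-ij} z^j,$$

i.e. that

$$z \prod_{j \ne i} (z + \alpha^j) + z^n + 1 = \sum_{j=0}^{n-1} \alpha^{-ij} z^j.$$

To see that this holds, just multiply both sides by $z + \alpha^i$: each side becomes $\alpha^i (z^n + 1)$. Finally (30) follows
from (28). Q.E.D.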


Corollary 8. Let $\Gamma(L, G)$ be a binary Goppa code with $L = \{1, \alpha, \ldots, \alpha^{n-1}\}$,
where $\alpha \in GF(2^m)$ is a primitive $n$th root of unity. Then

$$\Gamma(L, G) = \{a(x): [z^{n-1} A(z)]_n \equiv 0 \bmod G(z)\}. \tag{31}$$

Proof. As in the proof of the theorem, write $R_a(z) = F(z)/(z^n + 1)$. Since $z^n + 1$
and $G(z)$ are relatively prime, $a(x) \in \Gamma(L, G)$ iff $F(z) \equiv 0 \bmod G(z)$. From
(29), $A(z) = [zF(z)]_n$ and so $[z^{n-1} A(z)]_n = F(z)$. Q.E.D.

Note that if wt$(a(x))$ is even, $A_0 = 0$ and $[z^{n-1} A(z)]_n = A(z)/z$. Thus in this
case $zG(z)$ divides $A(z)$.


Example. Consider the binary Goppa code $\Gamma(L, G)$ where $G = z^3 + z + 1$, $L = \{1, \alpha, \ldots, \alpha^{14}\}$
and $\alpha$ is a primitive element of $GF(2^4)$. The parity check matrix
(12) is

This is a poor code, but a good illustration of the Corollary. The Mattson-Solomon
polynomials corresponding to the codewords $u_1$, $u_2$ and $u_3$ are
respectively

$$1 + z + z^2 + z^4 + z^8,$$
$$z + z^2 + z^3 + \cdots + z^{14},$$
$$1 + \omega^2 z + \omega z^2 + \omega z^3 + \omega^2 z^4 + \omega z^5 + \omega^2 z^6 + \omega^2 z^7 + \omega z^8 + \omega^2 z^9 + \omega^2 z^{10} + \omega z^{11} + \omega z^{12} + \omega^2 z^{13} + \omega z^{14},$$

where $\omega$ stands for $\alpha^5$. Applying the Corollary to the first of these we find

$$[z^{14} A(z)]_{15} = 1 + z + z^3 + z^7 + z^{14},$$

which is indeed a multiple of $z^3 + z + 1$. The reader is invited to verify the
Corollary for the other two polynomials.
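The divisibility asserted for the first polynomial is easy to confirm mechanically. The sketch below is only an illustration and is not part of the original text: polynomials over GF(2) are stored as Python integers (bit e set means the term $z^e$ is present), the helper names are ours, and we assume the first Mattson-Solomon polynomial is $1 + z + z^2 + z^4 + z^8$ with $G(z) = z^3 + z + 1$ and $n = 15$.

```python
def poly(*exponents):
    """Binary polynomial with the given exponents, packed into an int."""
    value = 0
    for e in exponents:
        value ^= 1 << e
    return value


def gf2_mod(a, b):
    """Remainder of a(z) divided by b(z) over GF(2)."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a


def times_power_mod_cyclic(a, k, n):
    """[z^k a(z)]_n : multiply by z^k and reduce modulo z^n - 1."""
    out = 0
    for e in range(a.bit_length()):
        if (a >> e) & 1:
            out ^= 1 << ((e + k) % n)
    return out


n = 15
G = poly(0, 1, 3)                      # z^3 + z + 1
A = poly(0, 1, 2, 4, 8)                # assumed MS polynomial of u_1
F = times_power_mod_cyclic(A, n - 1, n)
remainder = gf2_mod(F, G)              # zero iff G(z) divides [z^(n-1) A(z)]_n
```

Since the second Mattson-Solomon polynomial also has coefficients in GF(2), the same helper can be applied to it unchanged; the third has coefficients in GF(4) and would need arithmetic in that field.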


Corollary 9. Let $\mathscr{C}$ be a binary Goppa code $\Gamma(L, G)$ with $L = \{1, \alpha, \ldots, \alpha^{n-1}\}$,
where $\alpha \in GF(2^m)$ is a primitive $n$th root of unity. If $\mathscr{C}$ is cyclic then $\mathscr{C}$ is a
BCH code and $G(z) = z^r$ for some r.

Proof. Suppose $G(z) \ne \gamma z^l$ for any $\gamma \in GF(2^m)$ and integer l. Then $G(z)$ has a
zero $\beta \ne 0$ in some field $GF(2^s)$. Let $a(x)$ be a nonzero codeword in $\mathscr{C}$ with
even weight. From Corollary 8, $[z^{n-1} A(z)]_n = A(z)/z$ is a multiple of $G(z)$,
hence $A(\beta) = 0$. Since $\mathscr{C}$ is cyclic, by Theorem 22 (iv) of Ch. 8, $A(\alpha^i \beta) = 0$ for
$i = 0, \ldots, n - 1$. Thus $A(z)$ has at least n distinct zeros, so $\deg A(z) \ge n$,
which is impossible. Q.E.D.


Long Goppa codes are good. It was shown in §2 that long alternant codes
meet the Gilbert-Varshamov bound. In fact this is true for a much smaller
class of codes, as shown by the following problem.

Problem. (9)(a) Show that there exists a Goppa code $\Gamma(L, G)$ over $GF(q)$ with
$G(z)$ an irreducible polynomial over $GF(q^m)$ of degree r and $L = GF(q^m)$,
having parameters $[n = q^m, k \ge n - rm, d]$, provided

$$\sum_{w=r+1}^{d-1} \left[ \frac{w - 1}{r} \right] (q - 1)^w \binom{n}{w} < \frac{1}{r}\, q^{mr} \bigl(1 - (r - 1) q^{-mr/2}\bigr). \tag{32}$$

[Hint: From Theorem 15 of Ch. 4, show that there are $I_{q^m}(r) \ge (1/r)(q^{mr} - (r - 1) q^{mr/2})$
irreducible polynomials of degree r over $GF(q^m)$.]

(b) Hence show that there exist long irreducible Goppa codes which meet
the Gilbert-Varshamov bound.

But this doesn't say which $G(z)$ is best.


Research Problem (12.2). Which Goppa polynomial G(z) gives the highest 
minimum distance (or the lowest redundancy)? 


*§5. Extended double-error-correcting Goppa codes are cyclic

In this section certain extended binary double-error-correcting Goppa
codes are shown to be cyclic (generalizing Ex. 1 of §3). To do this we find a
group of permutations which preserve the code, and then show that one of
these permutations consists of a single cycle.

The codes to be considered have Goppa polynomial $G(z)$ which is a
quadratic with distinct roots, and L consists of all elements of $GF(2^m)$ that are
not zeros of $G(z)$. Thus $\Gamma(L, G)$ is a double-error-correcting code. There are
two cases, depending on whether or not $G(z)$ is irreducible. From Theorem 16
and Problem 6 of Ch. 9, $G(z)$ can be taken to be $z^2 + z + \beta$, where $\beta$ is an
element of $GF(2^m)$ with trace 1 if $G(z)$ is irreducible over $GF(2^m)$, or with
trace 0 if $G(z)$ is reducible over $GF(2^m)$. Let $\hat{\Gamma}(L, G)$ be the extended code
obtained by adding an overall parity check to $\Gamma(L, G)$, with coordinates
labeled by $L \cup \{\infty\}$. Thus $\hat{\Gamma}(L, G)$ has length $2^m + 1$ if $G(z)$ is irreducible, or
length $2^m - 1$ if $G(z)$ is reducible.

The irreducible case is considered first. 


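Theorem 10. Let $\hat{\Gamma}(L, G)$ be the $[2^m + 1, 2^m - 2m, 6]$ extended double-error-correcting
Goppa code just defined, where $G(z)$ is irreducible over $GF(2^m)$.
Then $\hat{\Gamma}(L, G)$ is fixed by the group $G_\Gamma$ consisting of the $2m(2^m + 1)$ permutations

$$G_\Gamma = \{C^i,\ C^i B(\eta),\ C^i D,\ C^i D B(\eta);\ \text{for } i = 0, \ldots, m - 1 \text{ and all } \eta \in GF(2^m)\} \tag{33}$$

where

$$B(\eta): \gamma \to \frac{\eta\gamma + \beta + \eta}{\gamma + \eta}, \qquad C: \gamma \to \gamma^2 + \beta, \qquad D: \gamma \to \gamma + 1, \tag{34}$$

act on all $\gamma$ in $GF(2^m) \cup \{\infty\}$. (More about $G_\Gamma$ in the next theorem.)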


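Proof. (i) First we show that the permutations $B(\eta)$, C and D preserve
$\hat{\Gamma}(L, G)$. (Note from Problem 8a that $B(\eta)$ is a permutation, for $T_m(\eta^2 + \eta) = 0$,
$T_m(\beta) = 1$, hence $\eta^2 + \eta + \beta \ne 0$.) From Problem 8b, $B(\eta)$ and D preserve $\hat{\Gamma}$.
To show that C preserves $\hat{\Gamma}$, let $a = (a(0), a(1), \ldots, a(\infty))$ be a codeword of $\hat{\Gamma}$
and let $Ca = (a'(0), a'(1), \ldots, a'(\infty))$, where $a'(\gamma) = a(\gamma^2 + \beta)$. Then from (25),
$Ca \in \hat{\Gamma}$ iff

$$\sum_{\gamma \in GF(2^m) \cup \{\infty\}} \frac{\gamma^i a'(\gamma)}{\gamma^2 + \gamma + \beta} = 0 \quad \text{for } i = 0, 1, 2. \tag{35}$$

Set $\alpha = \gamma^2 + \beta$, or $\gamma = \sqrt{\alpha + \beta}$ (recall from §6 of Ch. 4 that every element of
$GF(2^m)$ has a unique square root). Then (35) becomes

$$\sum_{\alpha \in GF(2^m) \cup \{\infty\}} \frac{(\alpha + \beta)^{i/2} a(\alpha)}{\alpha + \sqrt{\alpha + \beta}} = 0.$$

Squaring this we obtain

$$\sum_{\alpha \in GF(2^m) \cup \{\infty\}} \frac{(\alpha + \beta)^i a(\alpha)}{\alpha^2 + \alpha + \beta},$$

which from the definition of $\hat{\Gamma}$, Equation (25), is indeed zero for i = 0, 1, 2.
Therefore C preserves $\hat{\Gamma}$.

(ii) Secondly we show that the group generated by
C, D and all $B(\eta)$ is given by (33). Any permutation of the form†

$$\gamma \to \frac{a\gamma + b}{c\gamma + d}, \qquad a, b, c, d \in GF(2^m),\ ad - bc \ne 0, \tag{36}$$

can be represented by the invertible matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$. For example $B(\eta)$ and D are
represented by

$$\begin{pmatrix} \eta & \beta + \eta \\ 1 & \eta \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

Of course $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and $\epsilon \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, for any $\epsilon$ in $GF(2^m)^*$, represent the same permutation.
If permutations $\pi_1$, $\pi_2$ are represented by the matrices $M_1$, $M_2$, then $\pi_1 \pi_2$
(which means first apply $\pi_1$, then $\pi_2$) is represented by the matrix product
$M_2 M_1$. For example $DB(\eta)$ is represented by

$$\begin{pmatrix} \eta & \beta + \eta \\ 1 & \eta \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \eta & \beta \\ 1 & \eta + 1 \end{pmatrix}. \tag{37}$$

Note that

$$B(\eta)^2 = D^2 = I, \tag{38}$$

$$C^m: \gamma \to \gamma + T_m(\beta) = \gamma + 1, \tag{39}$$

so $C^m = D$ and $C^{2m} = I$.

Let H be the group generated by D and all $B(\eta)$. From (38) and

$$B(\xi) B(\eta) = DB\!\left( \frac{\xi\eta + \eta + \beta}{\xi + \eta} \right) \quad (\xi \ne \eta),$$

$$B(\eta) D = DB(1 + \eta),$$

it follows that

$$H = \{I,\ B(\eta),\ D,\ DB(\eta);\ \text{for all } \eta \in GF(2^m)\},$$

and has order $2(2^m + 1)$. It remains to show that the group generated by H and
C is given by (33). This follows from

$$B(\eta) C = C B(\eta^2 + \beta),$$
$$DC = CD. \qquad \text{Q.E.D.}$$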


Problem. (10) Show that $C B(\eta) C^{-1} = B((\eta + \beta)^{1/2})$.

We shall now prove that $\hat{\Gamma}(L, G)$ is equivalent to a cyclic code.

†The group of all such permutations is called $PGL_2(2^m)$ (cf. §5 of Ch. 16).

Theorem 11. (Berlekamp and Moreno.) Let $\hat{\Gamma}(L, G)$ be an extended irreducible
double-error-correcting Goppa code as in Theorem 10. The group $G_\Gamma$ given by
(33), which fixes $\hat{\Gamma}(L, G)$, contains a permutation $DB(\eta_0)$ which is a single
cycle consisting of all the elements of $GF(2^m) \cup \{\infty\}$. Thus the coordinate
places can be rearranged so that $\hat{\Gamma}(L, G)$ is cyclic.


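Proof. The elements of $GF(2^m) \cup \{\infty\}$ will be represented by nonzero vectors
$\binom{r}{s}$ with $r, s \in GF(2^m)$, where $\binom{r}{s}$ represents the element $r/s$ if $s \ne 0$, and $\binom{1}{0}$
represents $\infty$. Also $\binom{r}{s}$ and $\epsilon \binom{r}{s}$ represent the same element, for $\epsilon \in GF(2^m)^*$.
The permutation $DB(\eta)$ given by (37) sends

$$\infty \to \eta \to \eta^2 + \beta \to \frac{\eta^3 + \beta\eta + \beta}{\eta^2 + \eta + \beta + 1} \to \cdots,$$

or equivalently, using the vector notation,

$$\binom{1}{0} \to \begin{pmatrix} \eta & \beta \\ 1 & 1 + \eta \end{pmatrix} \binom{1}{0} = \binom{\eta}{1} \to \begin{pmatrix} \eta & \beta \\ 1 & 1 + \eta \end{pmatrix} \binom{\eta}{1} = \binom{\eta^2 + \beta}{1} \to \binom{\eta^3 + \beta\eta + \beta}{\eta^2 + \eta + \beta + 1} \to \cdots.$$

We shall find an $\eta_0$ such that $DB(\eta_0)$ consists of a single cycle, i.e. such that
the least integer n for which

$$\begin{pmatrix} \eta_0 & \beta \\ 1 & 1 + \eta_0 \end{pmatrix}^n \binom{1}{0} = k \binom{1}{0}, \quad \text{for some } k \in GF(2^m),$$

is $2^m + 1$. The eigenvalues of $\begin{pmatrix} \eta & \beta \\ 1 & 1 + \eta \end{pmatrix}$ are $\lambda$ and $1 + \lambda$, where

$$\lambda(1 + \lambda) = \eta^2 + \eta + \beta.$$

The eigenvectors are given by

$$\begin{pmatrix} \eta & \beta \\ 1 & 1 + \eta \end{pmatrix} \binom{\lambda + \eta + 1}{1} = \lambda \binom{\lambda + \eta + 1}{1}, \qquad \begin{pmatrix} \eta & \beta \\ 1 & 1 + \eta \end{pmatrix} \binom{\lambda + \eta}{1} = (\lambda + 1) \binom{\lambda + \eta}{1}.$$

Let us write $\binom{1}{0}$ in terms of the eigenvectors as

$$\binom{1}{0} = \binom{\lambda + \eta + 1}{1} + \binom{\lambda + \eta}{1}.$$

Therefore

$$\begin{pmatrix} \eta & \beta \\ 1 & 1 + \eta \end{pmatrix}^n \binom{1}{0} = \lambda^n \binom{\lambda + \eta + 1}{1} + (\lambda + 1)^n \binom{\lambda + \eta}{1}.$$

Now let n be the smallest integer such that

$$\begin{pmatrix} \eta & \beta \\ 1 & 1 + \eta \end{pmatrix}^n \binom{1}{0} = k \binom{1}{0}$$

for some $k \in GF(2^m)$. Then

$$(\lambda + 1)^n (\lambda + \eta) + \lambda^n (\lambda + \eta + 1) = k, \tag{40}$$

$$(\lambda + 1)^n + \lambda^n = 0. \tag{41}$$

Substituting (41) into (40) we obtain

$$k = \lambda^n. \tag{42}$$

Let $\xi \in GF(2^{2m})$, $\xi \notin GF(2^m)$. Then

$$(\xi^{2^m} + \xi)^{2^m} = \xi + \xi^{2^m},$$

so by Theorem 8 of Ch. 4,

$$\xi^{2^m} + \xi = \alpha \ (\text{say}) \in GF(2^m).$$

Now suppose $\xi$ is a primitive $(2^m + 1)$th root of unity. Then $\xi^{2^m} = \xi^{-1}$, so
$\xi^{-1} + \xi + \alpha = 0$, or

$$\xi^2 + \alpha\xi + 1 = 0.$$

We choose $\eta_0$ by setting $\lambda = \xi/\alpha$, thus

$$\eta_0^2 + \eta_0 + \beta = \lambda(1 + \lambda) = (\xi^2 + \alpha\xi)/\alpha^2 = 1/\alpha^2. \tag{43}$$

Now $\lambda = \xi/\alpha$ is a zero of $x^2 + x + 1/\alpha^2$, so $T_m(1/\alpha^2) = 1$ by Theorem 15 of Ch.
9. Hence $T_m(\beta + 1/\alpha^2) = 0$, and from (43) and Theorem 15 of Ch. 9, $\eta_0$ is in
$GF(2^m)$.

From (42), we must have $\lambda^n = (\xi/\alpha)^n \in GF(2^m)$, i.e.

$$(\xi^n)^{2^m} = \xi^n, \quad \text{or} \quad (\xi^{2^m - 1})^n = 1. \tag{44}$$

But $2^m - 1$ and $2^m + 1$ are relatively prime, so $\xi^{2^m - 1}$ is also a primitive $(2^m + 1)$th
root of unity. Therefore $n = 2^m + 1$ is the smallest number for which (44) is
possible, and so $DB(\eta_0)$ consists of a single cycle. (Note that Equation (41)
holds with $n = 2^m + 1$.) Q.E.D.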


From §4 of Ch. 9, any cyclic code of length $2^m + 1$ is reversible (as the code
(19) is). So once we know from Theorem 11 that $\hat{\Gamma}(L, G)$ is cyclic, it follows
that the automorphism group of $\hat{\Gamma}(L, G)$ contains the reversing permutation R
and the permutation $\sigma_2$ (which sends i to 2i if the coordinates are labeled
properly); see §4 of Ch. 9. By Problem 4 of Ch. 9, the automorphism group
of $\hat{\Gamma}(L, G)$ has order at least $2m(2^m + 1)$. In fact R and $\sigma_2$ are contained in $G_\Gamma$
(Problem 11), so $G_\Gamma$ is exactly the group generated by the cyclic permutation
$DB(\eta_0)$ and $\sigma_2$.

Of course the automorphism group of $\hat{\Gamma}(L, G)$ may be larger than $G_\Gamma$, as
shown by the examples below.

Problem. (11) (i) Show that the permutation

$$\gamma \to \frac{\beta}{\gamma^2 + \beta + 1} \tag{45}$$

is in $G_\Gamma$, fixes the point $\binom{1}{1}$, and sends $(DB(\eta_0))^j \binom{1}{1}$ to $(DB(\eta_0))^{2j} \binom{1}{1}$. So this is
the permutation $\sigma_2$.

[Hint: Write $\binom{1}{1} = (\lambda + \eta)\binom{\lambda + \eta}{1} + (\lambda + \eta + 1)\binom{\lambda + \eta + 1}{1}$.]

(ii) Show that $R = \sigma_2^m (DB(\eta_0))^{-1} = DB(\eta_0) D$ is the reversing permutation.


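Examples. To illustrate Theorems 10 and 11, consider $\hat{\Gamma}(L, G)$ = the [9, 2, 6]
code (19) of example 1 in §3. Some of the permutations fixing this code are

$C: \gamma \to \gamma^2 + 1$, which is $(\infty)(0\ 1)(\alpha\ \alpha^6\ \alpha^4\ \alpha^3\ \alpha^2\ \alpha^5)$,
$D: \gamma \to \gamma + 1$, which is $(\infty)(0\ 1)(\alpha\ \alpha^3)(\alpha^2\ \alpha^6)(\alpha^4\ \alpha^5)$,
and

$$DB(\eta_0) = DB(\alpha^2): \gamma \to \frac{\alpha^2 \gamma + 1}{\gamma + \alpha^6}, \quad \text{which is } (1\ \alpha^4\ \alpha^6\ \infty\ \alpha^2\ \alpha^5\ 0\ \alpha\ \alpha^3).$$

As shown in (19), arranging the coordinate places in the order $1, \alpha^4, \alpha^6, \infty, \alpha^2,
\alpha^5, 0, \alpha, \alpha^3$ does make the code cyclic. For this code $\sigma_2$ (Equation (45)) is
$\gamma \to 1/\gamma^2$, which is $(1)(\alpha^4\ \alpha^6\ \alpha^2\ \alpha^3\ \alpha\ \alpha^5)(\infty\ 0)$.

From Theorem 11 the code is fixed by the 54 permutations of $G_\Gamma$. In this
case the code is in fact fixed by many additional permutations.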
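The cycle structure in this example can be reproduced with a few lines of code. The following Python sketch is an illustration only and rests on assumptions not stated in the text above: $GF(2^3)$ is taken to be $GF(2)[x]/(x^3 + x + 1)$ with $\alpha = x$, elements are stored as 3-bit integers, $\beta = 1$ and $\eta_0 = \alpha^2$, and all function names are ours.

```python
INF = "inf"   # the extra point of GF(8) U {infinity}


def gf8_mul(a, b):
    """Product in GF(8) = GF(2)[x]/(x^3 + x + 1); elements are ints 0..7."""
    r = 0
    for i in range(3):
        if (b >> i) & 1:
            r ^= a << i
    for d in (4, 3):                     # clear the terms of degree 4 and 3
        if (r >> d) & 1:
            r ^= 0b1011 << (d - 3)
    return r


def gf8_inv(a):
    """Inverse of a nonzero element, found by search (the field is tiny)."""
    return next(b for b in range(1, 8) if gf8_mul(a, b) == 1)


def fractional_linear(a, b, c, d):
    """The map y -> (a y + b)/(c y + d) on GF(8) U {infinity}."""
    def f(y):
        if y == INF:
            return INF if c == 0 else gf8_mul(a, gf8_inv(c))
        num = gf8_mul(a, y) ^ b          # addition in GF(8) is XOR
        den = gf8_mul(c, y) ^ d
        return INF if den == 0 else gf8_mul(num, gf8_inv(den))
    return f


def cycle_from(f, start):
    """The cycle of the permutation f that contains start."""
    out, y = [start], f(start)
    while y != start:
        out.append(y)
        y = f(y)
    return out


def sigma2(y):
    """The map y -> 1/y^2, with 0 and infinity interchanged."""
    if y == INF:
        return 0
    if y == 0:
        return INF
    return gf8_inv(gf8_mul(y, y))


alpha = 0b010
beta = 1
eta0 = gf8_mul(alpha, alpha)                        # eta_0 = alpha^2
DB = fractional_linear(eta0, beta, 1, eta0 ^ 1)     # matrix (37)
cycle = cycle_from(DB, 1)                           # expected to have length 9
```

With the coordinate places listed in the order given by `cycle`, the map `sigma2` sends the place in position j to the place in position 2j mod 9, which is the defining property of $\sigma_2$.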


Problem. (12) Show that the automorphism group of the [9, 2, 6] code (19) has 
order 1296. 


Applying Theorem 11 to example 3 of §3 we obtain a [17, 8, 6] cyclic code,
which is in fact a quadratic residue code (see Ch. 16) whose automorphism
group has order 2448.

As a third example, take $G(z) = z^2 + z + 1$ and $L = GF(2^5)$. Then $\hat{\Gamma}(L, G)$ is a
[33, 22, 6] cyclic code with generator polynomial $1 + x^2 + x^5 + x^6 + x^9 + x^{11}$.
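A generator polynomial of a cyclic code of length 33 must divide $x^{33} + 1$. The short Python check below is a sketch under a stated assumption: it takes the generator polynomial of the third example to be $1 + x^2 + x^5 + x^6 + x^9 + x^{11}$ (our reading of the source) and stores binary polynomials as integers; the helper names are ours.

```python
def poly(*exponents):
    """Binary polynomial with the given exponents, packed into an int."""
    value = 0
    for e in exponents:
        value ^= 1 << e
    return value


def gf2_divmod(a, b):
    """Quotient and remainder of a(x) / b(x) over GF(2)."""
    q = 0
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        shift = a.bit_length() - 1 - db
        q ^= 1 << shift
        a ^= b << shift
    return q, a


g = poly(0, 2, 5, 6, 9, 11)          # assumed generator polynomial
x33_plus_1 = poly(0, 33)
h, rem = gf2_divmod(x33_plus_1, g)   # h(x) is the check polynomial if rem == 0
dimension = 33 - (g.bit_length() - 1)
```

A zero remainder shows only that g(x) generates some cyclic code of length 33 and dimension 22; it does not by itself confirm the minimum distance 6.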


The case when G(z) is reducible. Similar results hold if $G(z)$ is reducible over
$GF(2^m)$, say $G(z) = z^2 + z + \beta = (z + \lambda)(z + \lambda + 1)$, where $\beta, \lambda \in GF(2^m)$ and
$T_m(\beta) = 0$. Now we take $L = GF(2^m) - \{\lambda, \lambda + 1\}$. The $2^m - 1$ coordinates of
$\hat{\Gamma}(L, G)$ are labeled by $L \cup \{\infty\}$. For $\eta \ne \lambda$, $\eta \ne \lambda + 1$ we again define $B(\eta)$, C
and D by (34). These are indeed permutations of $L \cup \{\infty\}$, for $B(\eta)$ and D
interchange $\lambda$ and $\lambda + 1$, and C fixes $\lambda$ and $\lambda + 1$. Then we have:

Theorem 12. If $G(z)$ is reducible then the $[2^m - 1, 2^m - 2m - 2, 6]$ extended
Goppa code $\hat{\Gamma}(L, G)$ is fixed by the group $G'_\Gamma$ consisting of the $2m(2^m - 1)$
permutations

$$\{C^i,\ C^i B(\eta),\ C^i D,\ C^i D B(\eta);\ \text{for } i = 0, \ldots, m - 1 \text{ and all } \eta \in GF(2^m) - \{\lambda, \lambda + 1\}\} \tag{46}$$

acting on $L \cup \{\infty\}$. This group contains a permutation $DB(\eta_0)$ which is a single
cycle consisting of all the elements of $L \cup \{\infty\}$. Hence $\hat{\Gamma}(L, G)$ is equivalent to a
cyclic code.


Proof. The first statement is proved as in Theorem 10. Note that now $C^m = I$.
To prove the second statement we will show that $GF(2^m)$ contains an element
$\lambda = \lambda_0$ (say) that does not satisfy the equation

$$\lambda^d + (\lambda + 1)^d = 0 \tag{47}$$

for any $d < 2^m - 1$. Then choose $\eta_0 \in GF(2^m)$ so that

$$\eta_0^2 + \eta_0 + \beta = \lambda_0(\lambda_0 + 1).$$

This can be done since $T_m(\beta) = T_m(\lambda_0^2 + \lambda_0) = 0$. Then $n = 2^m - 1$ is the smallest
number for which Equations (41) and (42) hold, which completes the
proof.

It remains to show that there is a $\lambda$ in $GF(2^m)$ such that (47) does not hold
for any $d < 2^m - 1$. The minimum d for which (47) holds must be a divisor of
$2^m - 1$. For

$$\lambda^{2^m - 1} = (\lambda + 1)^{2^m - 1} \quad (\text{since } \lambda \in GF(2^m))$$

and so from (47), $\lambda^\delta = (\lambda + 1)^\delta$ where $\delta = \gcd(d, 2^m - 1)$. But d is minimal so
$\delta = d$ and $d \mid 2^m - 1$. Let $F_d$ be the number of zeros of $\lambda^d + (\lambda + 1)^d$ in $GF(2^m)$,
where $d \mid 2^m - 1$. Now $\lambda^d + (\lambda + 1)^d$ divides $\lambda^{2^m - 1} + (\lambda + 1)^{2^m - 1}$, and the latter
equation has $2^m - 2$ zeros in $GF(2^m)$ (namely all of $GF(2^m)$ except 0 and 1).
Thus all the zeros of $\lambda^d + (\lambda + 1)^d$ are in $GF(2^m)$, i.e. $F_d = d - 1$.

The number of zeros of $\lambda^{2^m - 1} + (\lambda + 1)^{2^m - 1}$ which are not zeros of (47) for
some $d < 2^m - 1$ is

$$\sum_{d \mid 2^m - 1} F_d \, \mu\!\left( \frac{2^m - 1}{d} \right), \tag{48}$$

where $\mu$ is the Möbius function defined in Ch. 4. For if $\xi$ is such a zero it will
be counted in the term $d = 2^m - 1$ and no other term, and so is counted once.
On the other hand, if $\xi$ is a zero of $\lambda^d + (\lambda + 1)^d$ for $d < 2^m - 1$, it is also a zero
of $\lambda^D + (\lambda + 1)^D$ whenever $d \mid D$, i.e. the number of times it is counted in the
sum is

$$\sum_{D:\ d \mid D \mid 2^m - 1} \mu\!\left( \frac{2^m - 1}{D} \right) = \sum_{d' \mid (2^m - 1)/d} \mu\!\left( \frac{2^m - 1}{d d'} \right), \quad \text{where } D = dd',$$

$$= 0 \quad \text{by Theorem 14 of Ch. 4.}$$

Finally (48) is equal to

$$\sum_{d \mid 2^m - 1} d \, \mu\!\left( \frac{2^m - 1}{d} \right) - \sum_{d \mid 2^m - 1} \mu\!\left( \frac{2^m - 1}{d} \right) = \phi(2^m - 1) - 0 \quad (\text{by Theorem 14 of Ch. 4})$$

$$> 0, \quad \text{as required. Q.E.D.}$$
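The evaluation of (48) uses only the value $F_d = d - 1$ and two standard Möbius identities, so it can be confirmed numerically for small m. The Python sketch below is an illustration (the function names are ours); it computes $\sum_{d \mid N} (d - 1)\mu(N/d)$ for $N = 2^m - 1$ and compares it with Euler's function $\phi(N)$.

```python
from math import gcd


def mobius(n):
    """Moebius function: 0 if n has a square factor, else (-1)^(number of primes)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def euler_phi(n):
    """Number of integers in 1..n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def count_48(m):
    """The sum (48) with F_d = d - 1, taken over the divisors d of 2^m - 1."""
    N = 2 ** m - 1
    return sum((d - 1) * mobius(N // d) for d in range(1, N + 1) if N % d == 0)
```

For example, with m = 4 both `count_48(4)` and `euler_phi(15)` equal 8, the number of primitive elements of $GF(2^4)$.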


Problem. (13) Show that this code is also reversible and the reversing
permutation may be taken to be $B(\eta_0)$.

Hence in this case $\hat{\Gamma}(L, G)$ is fixed by the group of order $2m(2^m - 1)$
generated by the cyclic permutation $DB(\eta_0)$, $\sigma_2$ (Equation (45)) and the
reversing permutation $B(\eta_0)$. (See Problem 4 of Ch. 9.)


Research Problem (12.3). What other extended Goppa codes are cyclic? 


*§6. Generalized Srivastava codes 


Another interesting class of alternant codes are the generalized Srivastava 
codes. (See Fig. 12.4.) 


The code is defined by the parity check matrix (51), where $\alpha_1, \ldots, \alpha_n$,
$w_1, \ldots, w_s$ are $n + s$ distinct elements of $GF(q^m)$, and $z_1, \ldots, z_n$
are nonzero elements of $GF(q^m)$. It is an $[n, k \ge n - mst, d \ge st + 1]$
code over $GF(q)$, and is an alternant code. The original Srivastava
codes are the case $t = 1$, $z_i = \alpha_i^\mu$ for some $\mu$.

Fig. 12.4. Properties of generalized Srivastava code.








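Definition. In the parity check matrix (3) for the alternant code $\mathscr{A}(\alpha, y)$,
suppose $r = st$ and let $\alpha_1, \ldots, \alpha_n, w_1, \ldots, w_s$ be $n + s$ distinct elements of
$GF(q^m)$, $z_1, \ldots, z_n$ be nonzero elements of $GF(q^m)$. Also set

$$g_{(l-1)t+k}(x) = \frac{\prod_{j=1}^{s} (x - w_j)^t}{(x - w_l)^k}, \quad l = 1, \ldots, s,\ k = 1, \ldots, t \tag{49}$$

(note that this is a polynomial in x), and

$$y_i = \frac{z_i}{\prod_{j=1}^{s} (\alpha_i - w_j)^t}, \quad i = 1, \ldots, n, \tag{50}$$

so that

$$y_i \, g_{(l-1)t+k}(\alpha_i) = \frac{z_i}{(\alpha_i - w_l)^k}.$$

The resulting code is called a generalized Srivastava code. Thus the parity
check matrix for a generalized Srivastava code is

$$H = \begin{bmatrix} H_1 \\ H_2 \\ \vdots \\ H_s \end{bmatrix}, \tag{51}$$

where

$$H_l = \begin{bmatrix} \dfrac{z_1}{\alpha_1 - w_l} & \dfrac{z_2}{\alpha_2 - w_l} & \cdots & \dfrac{z_n}{\alpha_n - w_l} \\ \dfrac{z_1}{(\alpha_1 - w_l)^2} & \dfrac{z_2}{(\alpha_2 - w_l)^2} & \cdots & \dfrac{z_n}{(\alpha_n - w_l)^2} \\ \vdots & \vdots & & \vdots \\ \dfrac{z_1}{(\alpha_1 - w_l)^t} & \dfrac{z_2}{(\alpha_2 - w_l)^t} & \cdots & \dfrac{z_n}{(\alpha_n - w_l)^t} \end{bmatrix} \tag{52}$$

for $l = 1, \ldots, s$.

The original Srivastava codes are the special case $t = 1$, $z_i = \alpha_i^\mu$ for some $\mu$,
and have parity check matrix

$$H = \begin{bmatrix} \dfrac{\alpha_1^\mu}{\alpha_1 - w_1} & \cdots & \dfrac{\alpha_n^\mu}{\alpha_n - w_1} \\ \vdots & & \vdots \\ \dfrac{\alpha_1^\mu}{\alpha_1 - w_s} & \cdots & \dfrac{\alpha_n^\mu}{\alpha_n - w_s} \end{bmatrix}. \tag{53}$$

Since there are s $w_i$'s, there can be at most $q^m - s$ $\alpha_i$'s, so the length of a
generalized Srivastava code is at most $q^m - s$.
If $\alpha_1, \ldots, \alpha_n$ are chosen to be all the elements of $GF(q^m)$ except the $w_i$'s,
then $n = q^m - s$ and the codes are called primitive (by analogy with BCH
codes).
Since it is an alternant code, a generalized Srivastava code has $k \ge n - mst$
and $d \ge st + 1$.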


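Example. Consider the generalized binary Srivastava code with $m = 6$, $n = 8$,
$r = 2$, $s = 1$, $t = 2$, $\alpha_1, \ldots, \alpha_8$ = the elements of $GF(2^3)$ lying in $GF(2^6)$, i.e.

$$\{\alpha_1, \ldots, \alpha_8\} = \{0, 1, \alpha^9, \alpha^{18}, \alpha^{27}, \alpha^{36}, \alpha^{45}, \alpha^{54}\},$$

where $\alpha$ is a primitive element of $GF(2^6)$. Also

$$w_1 = \alpha, \quad z_i = 1 \ \text{for } i = 1, \ldots, 8.$$

Therefore

$$H = \begin{bmatrix} \dfrac{1}{0 - \alpha} & \dfrac{1}{1 - \alpha} & \dfrac{1}{\alpha^9 - \alpha} & \cdots & \dfrac{1}{\alpha^{54} - \alpha} \\ \dfrac{1}{(0 - \alpha)^2} & \dfrac{1}{(1 - \alpha)^2} & \dfrac{1}{(\alpha^9 - \alpha)^2} & \cdots & \dfrac{1}{(\alpha^{54} - \alpha)^2} \end{bmatrix}. \tag{54}$$

The second row is the square of the first and is redundant. Expanding the first
row in powers of $\alpha$ we find from Fig. 4.5

H = (binary 6 × 8 matrix) (55)

The codewords are

This is an [8, 2, 5] code which is in fact equivalent to the Goppa code of
example 1 of §3.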


Problems. (14) Show that a Srivastava code (with $t = 1$) is a Goppa code;
hence $d \ge 2s + 1$ for binary Srivastava codes.

(15) Show that the binary Srivastava code (53) with $m = 4$, $s = 2$, $n = 14$,
$w_1 = 0$, $w_2 = 1$, $\{\alpha_1, \ldots, \alpha_{14}\} = GF(2^4) - \{0, 1\}$ and $\mu = 0$ is a [14, 6, 5] code, and
that the dual code has minimum distance 4.

(16) Show that the binary Srivastava code with $m = 4$, $s = 2$, $n = 13$,
$w_1 = \alpha^{-1}$, $w_2 = \alpha^{-3}$, $\{\alpha_1, \ldots, \alpha_{13}\} = GF(2^4) - \{0, \alpha^{-1}, \alpha^{-3}\}$ and $\mu = 1$ is a [13, 5, 5]
code, and that the dual code has minimum distance $d' = 5$.

(17) If $m = 1$, show that a generalized Srivastava code is MDS (Ch. 10).
These are sometimes called Gabidulin codes.

(18) Let $\mathscr{C}$ be a binary primitive generalized Srivastava code with $z_i = 1$ for
all i, and $n = 2^m - s$. (i) If $s = 1$, show that $\mathscr{C}$ is unique and is a primitive
narrow-sense BCH code. (ii) If $s = 2$, again show that $\mathscr{C}$ is unique and that the
extended code $\hat{\mathscr{C}}$ is fixed by a transitive permutation group.

(19) Prove the following properties of the binary non-primitive generalized
Srivastava code with parameters:

$m = m_1 m_2$, with $m_1 > 1$, $m_2 > 1$,
$s = m_2$,
$w_i = \alpha^{2^{(i-1)m_1}}$, $i = 1, \ldots, m_2$, where $\alpha$ is a primitive element of $GF(2^m)$,
$\alpha_1, \ldots, \alpha_n$ distinct elements of $GF(2^{m_1}) \subset GF(2^m)$,
$n \le 2^{m_1}$,
$z_1, \ldots, z_n$ nonzero elements of $GF(2^{m_1})$.

(i) The rows of $H_2, \ldots, H_s$ in (51) are redundant. Hence H can be taken
to be the RHS of (52) with $w_l$ replaced by $\alpha$.

(ii) (54) is a code of this type.

(iii) $k \ge n - mt$ and $d \ge m_2 t + 1$.

(iv) If all $z_i = 1$ then $k \ge n - mt/2$.

(v) If all $z_i = 1$ and $m_2 = 2$, then the extended code is fixed by a transitive
permutation group.


*§7. Chien-Choy generalized BCH codes

These are another special class of alternant codes, and are defined in terms
of two polynomials $P(z)$ and $G(z)$. (See Fig. 12.5.) Let n be relatively prime to
q, and let $GF(q^m)$ be the smallest extension field of $GF(q)$ which contains all
$n$th roots of unity.












GBCH(P, G) consists of all $a(x)$ such that (56) holds, and is an
$[n, k \ge n - rm, d \ge r + 1]$ code over $GF(q)$, where $P(z)$ and $G(z)$
are polynomials over $GF(q^m)$, $\deg P \le n - 1$, $r = \deg G \le n - 1$,
which are relatively prime to $z^n - 1$. A parity check matrix is given
by (57). These are a special class of alternant codes.

Fig. 12.5. Properties of Chien-Choy generalized BCH code GBCH(P, G).


Definition. The Chien-Choy generalized BCH code of length n over $GF(q)$
with associated polynomials $P(z)$ and $G(z)$, abbreviated GBCH(P, G), is
defined as follows. Let $P(z)$ and $G(z)$ be polynomials with coefficients from
$GF(q^m)$ with $\deg P(z) \le n - 1$ and $r = \deg G(z) \le n - 1$, which are relatively
prime to $z^n - 1$. Then GBCH(P, G) consists of all $a(x)$ with coefficients in
$GF(q)$ and degree $\le n - 1$ for which the MS polynomial $A(z)$ satisfies

$$[A(z)P(z)]_n \equiv 0 \bmod G(z). \tag{56}$$

For example, in the binary case GBCH($z^{n-1}$, G) is the Goppa code $\Gamma(L, G)$
where L consists of the $n$th roots of unity, by Corollary 8.


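The parity check matrix for GBCH(P, G) is found as follows.

$a(x) \in$ GBCH(P, G)

iff $\exists\, U(z)$ such that $[A(z)P(z)]_n = U(z)G(z)$ and $\deg U(z) \le n - 1 - r$

iff (from Theorem 22, Ch. 8) $\exists\, u(x)$ with coefficients from $GF(q^m)$ such
that $a(x) * p(x) = u(x) * g(x)$ and $U_1 = \cdots = U_r = 0$, where $u(x)$, $p(x)$
and $g(x)$ are obtained from $U(z)$, $P(z)$ and $G(z)$ by the inverse
mapping Equation (9) of Ch. 8

iff $\exists\, u_0, \ldots, u_{n-1} \in GF(q^m)$ such that $a_i p_i = u_i g_i$ $(i = 0, \ldots, n - 1)$ and
$U_1 = \cdots = U_r = 0$. (Note that $p_i \ne 0$, $g_i \ne 0$ by hypothesis.)

iff $\sum_{i=0}^{n-1} \dfrac{a_i p_i}{g_i} \alpha^{ij}\ (= U_j) = 0$ for $j = 1, \ldots, r$.

In other words $a = (a_0, \ldots, a_{n-1}) \in$ GBCH(P, G) iff $Ha^T = 0$, where

$$H = \begin{bmatrix} p_0/g_0 & p_1\alpha/g_1 & p_2\alpha^2/g_2 & \cdots & p_{n-1}\alpha^{n-1}/g_{n-1} \\ p_0/g_0 & p_1\alpha^2/g_1 & p_2\alpha^4/g_2 & \cdots & p_{n-1}\alpha^{2(n-1)}/g_{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ p_0/g_0 & p_1\alpha^r/g_1 & p_2\alpha^{2r}/g_2 & \cdots & p_{n-1}\alpha^{r(n-1)}/g_{n-1} \end{bmatrix}$$

$$= \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & \alpha & \cdots & \alpha^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & \alpha^{r-1} & \cdots & \alpha^{(r-1)(n-1)} \end{bmatrix} \begin{bmatrix} p_0/g_0 & & & 0 \\ & p_1\alpha/g_1 & & \\ & & \ddots & \\ 0 & & & p_{n-1}\alpha^{n-1}/g_{n-1} \end{bmatrix}. \tag{57}$$

Thus H is a parity check matrix for the code. This shows that GBCH(P, G) is
an alternant code with $\alpha_1, \ldots, \alpha_n$ chosen to be the $n$th roots of unity and
$y_i = p_{i-1}\alpha^{i-1}/g_{i-1}$. Therefore GBCH(P, G) has parameters $[n, k \ge n - rm, d \ge r + 1]$
where $r = \deg G(z)$.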

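If $P(z) = z^{b+\delta-2}$, $G(z) = z^{\delta-1}$ then GBCH(P, G) is a BCH code, for it is
easily seen that the parity check matrix is then

$$\begin{bmatrix} 1 & \alpha^{b} & \alpha^{2b} & \cdots & \alpha^{(n-1)b} \\ 1 & \alpha^{b+1} & \alpha^{2(b+1)} & \cdots & \alpha^{(n-1)(b+1)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \alpha^{b+\delta-2} & \alpha^{2(b+\delta-2)} & \cdots & \alpha^{(n-1)(b+\delta-2)} \end{bmatrix},$$

which is the parity check matrix of a BCH code (Equation 19 of Ch. 7).

If $P(z) = z^{b_1+\delta-2} + z^{b_2+\delta-2}$, $G(z) = z^{\delta-1}$ then GBCH(P, G) contains the
intersection of the corresponding BCH codes, and in fact may equal this
intersection.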

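For example, let $n = 15$ and consider $P(z) = z$, $G(z) = z^2$. The parity check
matrix is

$$H = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & \alpha & \alpha^2 & \alpha^3 & \alpha^4 & \alpha^5 & \alpha^6 & \alpha^7 & \alpha^8 & \alpha^9 & \alpha^{10} & \alpha^{11} & \alpha^{12} & \alpha^{13} & \alpha^{14} \end{bmatrix}$$

and defines the [15, 10, 4] BCH code with idempotent $\theta_3 + \theta_5 + \theta_7$ (see Problem
7, Ch. 8 for the idempotents of block length 15). If $P(z) = z^5$, $G(z) = z^2$ the
parity check matrix is

$$\begin{bmatrix} 1 & \alpha^4 & \alpha^8 & \alpha^{12} & \alpha & \alpha^5 & \alpha^9 & \alpha^{13} & \alpha^2 & \alpha^6 & \alpha^{10} & \alpha^{14} & \alpha^3 & \alpha^7 & \alpha^{11} \\ 1 & \alpha^5 & \alpha^{10} & 1 & \alpha^5 & \alpha^{10} & 1 & \alpha^5 & \alpha^{10} & 1 & \alpha^5 & \alpha^{10} & 1 & \alpha^5 & \alpha^{10} \end{bmatrix}.$$

This determines the [15, 9] BCH code with idempotent $\theta_0 + \theta_3 + \theta_7$.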
GBCH(z + 2°, 2°) has parity check matrix. 


1 a a? a! a‘ aa’ ae a? a? a> a? a‘ a? a"? 
[ 1 a? a? a a’ 1 a^ a" a’ E 

When this is written in binary it is seen that one row can be discarded; hence
it defines a code of dimension 8 which is in fact the cyclic code with
idempotent $\theta_7 + \theta_3$, i.e. with zeros $1, \alpha, \alpha^2, \alpha^4, \alpha^5, \alpha^8, \alpha^{10}$.

Since this example is a cyclic code which is not a BCH code it follows
from Corollary 9 that the class of alternant codes includes other codes besides
Goppa codes.

Further properties and examples are given in the following problems, and
in [286].


Problems. (20) Show that $a(x) \in$ GBCH(P, G) iff $\deg [A(z)P(z)/G(z)]_n \le n - 1 - r$.

(21) Show that GBCH(P, G) = GBCH($P^*$, $z^r$), where $P^*(z) = [z^r P(z)/G(z)]_n$
and $r = \deg G(z)$. Thus these codes could have been defined
just in terms of $P^*$, without introducing G. However, the definition in terms
of two polynomials simplifies the analysis.


§8. The Euclidean algorithm

A crucial step in decoding alternant codes uses the Euclidean algorithm.
This is a simple and straightforward algorithm for finding the greatest common
divisor (g.c.d.) of two integers or polynomials, or for finding the continued
fraction expansion of a real number.

We shall describe the algorithm as it applies to polynomials, since that is
the case of greatest interest to us. Only trivial changes are needed to get the
algorithm for integers. In this section the coefficients of the polynomials can
belong to any field. If $f(z)$ and $g(z)$ are polynomials, by a g.c.d. of $f(z)$ and
$g(z)$ we mean a polynomial of highest degree which divides both $f(z)$ and
$g(z)$. Any constant multiple of such a polynomial is also a g.c.d. of $f(z)$ and
$g(z)$.


Theorem 13. (Euclidean Algorithm for polynomials.) Given polynomials $r_{-1}(z)$
and $r_0(z)$, with $\deg r_0 \le \deg r_{-1}$, we make repeated divisions to obtain the series of
equations

$$r_{-1}(z) = q_1(z) r_0(z) + r_1(z), \quad \deg r_1 < \deg r_0,$$
$$r_0(z) = q_2(z) r_1(z) + r_2(z), \quad \deg r_2 < \deg r_1,$$
$$\cdots$$
$$r_{j-2}(z) = q_j(z) r_{j-1}(z) + r_j(z), \quad \deg r_j < \deg r_{j-1},$$
$$r_{j-1}(z) = q_{j+1}(z) r_j(z).$$

Then the last nonzero remainder $r_j(z)$ in the division process is a g.c.d. of
$r_{-1}(z)$ and $r_0(z)$.

Proof. Clearly $r_j(z)$ divides $r_{j-1}(z)$, hence divides $r_{j-2}(z), \ldots$, hence divides
both $r_0(z)$ and $r_{-1}(z)$. Conversely, if $h(z)$ divides $r_{-1}(z)$ and $r_0(z)$, then $h(z)$
divides $r_1(z), \ldots, r_j(z)$. Therefore $r_j(z)$ is a g.c.d. of $r_{-1}(z)$ and $r_0(z)$. Q.E.D.


Example. We find the g.c.d. of $r_{-1}(z) = z^4 + z^3 + z^2 + 1$ and $r_0(z) = z^3 + 1$ over
the field GF(2):

$$z^4 + z^3 + z^2 + 1 = (z + 1)(z^3 + 1) + (z^2 + z),$$
$$z^3 + 1 = (z + 1)(z^2 + z) + (z + 1),$$
$$z^2 + z = z(z + 1).$$

Hence $r_2(z) = z + 1$ is the g.c.d. of $r_{-1}(z)$ and $r_0(z)$.
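The repeated divisions of Theorem 13 translate directly into code. The sketch below is an illustration only, restricted to the field GF(2) used in the example: a binary polynomial is stored as a Python integer whose bit e is the coefficient of $z^e$, and the function names are ours.

```python
def gf2_divmod(a, b):
    """Quotient and remainder of a(z) / b(z) over GF(2), b nonzero."""
    q = 0
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        shift = a.bit_length() - 1 - db
        q ^= 1 << shift          # add z^shift to the quotient
        a ^= b << shift          # subtract (= add) z^shift * b(z)
    return q, a


def euclid_steps(r_prev, r_cur):
    """List of (quotient, remainder) pairs produced by the Euclidean
    algorithm started from r_{-1} = r_prev and r_0 = r_cur."""
    steps = []
    while r_cur:
        q, r_next = gf2_divmod(r_prev, r_cur)
        steps.append((q, r_next))
        r_prev, r_cur = r_cur, r_next
    return steps


def gf2_gcd(a, b):
    """A g.c.d. of a(z) and b(z): the last nonzero remainder."""
    while b:
        a, b = b, gf2_divmod(a, b)[1]
    return a


r_minus1 = 0b11101     # z^4 + z^3 + z^2 + 1
r_0 = 0b1001           # z^3 + 1
steps = euclid_steps(r_minus1, r_0)
g = gf2_gcd(r_minus1, r_0)
```

Here `steps` holds the quotients and remainders of the three divisions in the example, and `g` encodes $z + 1$.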
As a by-product of the Euclidean algorithm we obtain the following useful
result.


Theorem 14. Let $r_{-1}(z)$ and $r_0(z)$ be polynomials with $\deg r_0 \le \deg r_{-1}$, and with
g.c.d. $h(z)$. Then there exist polynomials $U(z)$ and $V(z)$ such that

$$U(z) r_{-1}(z) + V(z) r_0(z) = h(z), \tag{58}$$

where $\deg U$ and $\deg V$ are less than $\deg r_{-1}$.


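Proof. Let the polynomials $U_i(z)$ and $V_i(z)$ be defined by

$$U_{-1}(z) = 0, \quad U_0(z) = 1,$$
$$V_{-1}(z) = 1, \quad V_0(z) = 0, \tag{59}$$

and

$$U_i(z) = q_i(z) U_{i-1}(z) + U_{i-2}(z),$$
$$V_i(z) = q_i(z) V_{i-1}(z) + V_{i-2}(z). \tag{60}$$

Then

$$\begin{bmatrix} U_i & U_{i-1} \\ V_i & V_{i-1} \end{bmatrix} = \begin{bmatrix} U_{i-1} & U_{i-2} \\ V_{i-1} & V_{i-2} \end{bmatrix} \begin{bmatrix} q_i & 1 \\ 1 & 0 \end{bmatrix} = \cdots = \begin{bmatrix} q_1 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} q_2 & 1 \\ 1 & 0 \end{bmatrix} \cdots \begin{bmatrix} q_i & 1 \\ 1 & 0 \end{bmatrix}. \tag{61}$$

Also

$$\begin{bmatrix} r_{i-2} \\ r_{i-1} \end{bmatrix} = \begin{bmatrix} q_i & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} r_{i-1} \\ r_i \end{bmatrix},$$

$$\begin{bmatrix} r_{-1} \\ r_0 \end{bmatrix} = \begin{bmatrix} q_1 & 1 \\ 1 & 0 \end{bmatrix} \cdots \begin{bmatrix} q_i & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} r_{i-1} \\ r_i \end{bmatrix} = \begin{bmatrix} U_i & U_{i-1} \\ V_i & V_{i-1} \end{bmatrix} \begin{bmatrix} r_{i-1} \\ r_i \end{bmatrix}. \tag{62}$$

The determinant of the RHS of (61) is $(-1)^i$. Hence from (61) and (62)

$$\begin{bmatrix} r_{i-1}(z) \\ r_i(z) \end{bmatrix} = (-1)^i \begin{bmatrix} V_{i-1}(z) & -U_{i-1}(z) \\ -V_i(z) & U_i(z) \end{bmatrix} \begin{bmatrix} r_{-1}(z) \\ r_0(z) \end{bmatrix}. \tag{63}$$

In particular

$$r_j(z) = (-1)^j [-V_j(z) r_{-1}(z) + U_j(z) r_0(z)], \tag{64}$$

which is (58). Also

$$\deg U_i = \sum_{k=1}^{i} \deg q_k,$$

$$\deg r_{i-1} = \deg r_{-1} - \sum_{k=1}^{i} \deg q_k,$$

$$\therefore \deg U_i = \deg r_{-1} - \deg r_{i-1} < \deg r_{-1},$$

and similarly for $V_i$. Q.E.D.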


Corollary 15. If $r_{-1}(z)$ and $r_0(z)$ are relatively prime then there exist polynomials
$U(z)$ and $V(z)$ such that

$$U(z) r_{-1}(z) + V(z) r_0(z) = 1.$$


Of course similar results hold for integers.

Example. Continuing the previous example, we find

$$U_{-1} = 0, \quad U_0 = 1, \quad U_1 = z + 1, \quad U_2 = z^2,$$
$$V_{-1} = 1, \quad V_0 = 0, \quad V_1 = 1, \quad V_2 = z + 1,$$

and Equation (58) states that

$$z + 1 = (z + 1)(z^4 + z^3 + z^2 + 1) + z^2(z^3 + 1).$$
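The recurrences (59) and (60) can be carried along with the divisions. The following Python sketch is again only an illustration over GF(2), with binary polynomials stored as integers and function names of our own choosing; over GF(2) the signs in (64) disappear, so the identity to be checked is $r_j = V_j r_{-1} + U_j r_0$.

```python
def gf2_mul(a, b):
    """Product of two binary polynomials (carry-less multiplication)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def gf2_divmod(a, b):
    """Quotient and remainder of a(z) / b(z) over GF(2), b nonzero."""
    q = 0
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        shift = a.bit_length() - 1 - db
        q ^= 1 << shift
        a ^= b << shift
    return q, a


def extended_euclid(r_prev, r_cur):
    """Return (r_j, U_j, V_j) for the last nonzero remainder r_j,
    with U_i and V_i defined by (59) and (60). Requires r_cur != 0."""
    U_prev, U_cur = 0, 1         # U_{-1}, U_0
    V_prev, V_cur = 1, 0         # V_{-1}, V_0
    while True:
        q, r_next = gf2_divmod(r_prev, r_cur)
        if r_next == 0:
            return r_cur, U_cur, V_cur
        U_prev, U_cur = U_cur, gf2_mul(q, U_cur) ^ U_prev
        V_prev, V_cur = V_cur, gf2_mul(q, V_cur) ^ V_prev
        r_prev, r_cur = r_cur, r_next


r_minus1 = 0b11101               # z^4 + z^3 + z^2 + 1
r_0 = 0b1001                     # z^3 + 1
h, U, V = extended_euclid(r_minus1, r_0)
combination = gf2_mul(V, r_minus1) ^ gf2_mul(U, r_0)
```

For the example above this returns $U_2 = z^2$ and $V_2 = z + 1$, and `combination` equals the g.c.d. $z + 1$.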


Problem. (22) Show that $U_i(z)$ and $V_i(z)$ are relatively prime.


§9. Decoding alternant codes

In this section we describe an efficient decoding algorithm for alternant
codes which makes use of the Euclidean algorithm of the previous section.
The decoding is in 3 stages, as for BCH codes: Stage I, find the syndrome.
Stage II, find the error locator and error evaluator polynomials using the
Euclidean algorithm. Stage III, find the locations and values of the errors, and
correct them. Since this decoding method applies equally well to BCH codes,
it completes the gap left in the decoding of BCH codes in §6 of Ch. 9.

Let $\mathscr{A}$ be an alternant code $\mathscr{A}(\alpha, y)$ over $GF(q)$ with parity check matrix
$H = XY$ given by Equation (2), and having minimum distance $d \ge r + 1$,
where r is even. Suppose $t \le r/2$ errors have occurred, in locations

$$X_1 = \alpha_{i_1}, \ldots, X_t = \alpha_{i_t},$$

with error values

$$Y_1 = a_{i_1}, \ldots, Y_t = a_{i_t},$$

as in §6 of Ch. 9.
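Stage I of the decoding is to find the syndrome. This is given by

$$S = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ \alpha_1 & \alpha_2 & \cdots & \alpha_n \\ \alpha_1^2 & \alpha_2^2 & \cdots & \alpha_n^2 \\ \vdots & \vdots & & \vdots \\ \alpha_1^{r-1} & \alpha_2^{r-1} & \cdots & \alpha_n^{r-1} \end{bmatrix} \begin{bmatrix} y_1 & & & 0 \\ & y_2 & & \\ & & \ddots & \\ 0 & & & y_n \end{bmatrix} \begin{bmatrix} 0 \\ \vdots \\ a_{i_1} \\ \vdots \\ a_{i_t} \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} S_0 \\ S_1 \\ \vdots \\ S_{r-1} \end{bmatrix} \ (\text{say}),$$

where

$$S_\mu = \sum_{\nu=1}^{t} \alpha_{i_\nu}^\mu \, y_{i_\nu} a_{i_\nu} = \sum_{\nu=1}^{t} X_\nu^\mu \, Y_\nu \, y_{i_\nu} \tag{65}$$

for $\mu = 0, \ldots, r - 1$.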
Stage II uses the syndrome to obtain the error locator polynomial

$$\sigma(z) = \prod_{i=1}^{t} (1 - X_i z) = \sum_{i=0}^{t} \sigma_i z^i, \quad \sigma_0 = 1, \tag{66}$$

and the error evaluator polynomial

$$\omega(z) = \sum_{\nu=1}^{t} Y_\nu \, y_{i_\nu} \prod_{\mu \ne \nu} (1 - X_\mu z). \tag{67}$$

(Note that this is a slightly different definition from that in Equation (17) of
Ch. 8.) These polynomials are related by

$$\frac{\omega(z)}{\sigma(z)} \equiv S(z) \bmod z^r, \tag{68}$$

where

$$S(z) = \sum_{\mu=0}^{r-1} S_\mu z^\mu.$$

(The proof of (68) is the same as that of Theorem 25 of Ch. 8.)

Thus the goal of Stage II is, given $S(z)$, to find $\sigma(z)$ and $\omega(z)$ such that (68)
holds and such that the degree of $\sigma(z)$ is as small as possible. There certainly
is a solution to (68), since it is assumed that $\le r/2$ errors occurred. Equation
(68) is called the key equation of the decoding process.

The key equation can be solved by the Euclidean algorithm of the previous
section. In fact, Equation (63) implies

$$r_i(z) \equiv (-1)^i U_i(z) r_0(z) \pmod{r_{-1}(z)}. \tag{69}$$

Thus we can use the following algorithm for Stage II.

Algorithm. Set

$$r_{-1}(z) = z^r, \quad r_0(z) = S(z),$$

and proceed with the Euclidean algorithm until reaching an $r_k(z)$ such that

$$\deg r_{k-1}(z) \ge \tfrac{1}{2} r \quad \text{and} \quad \deg r_k(z) \le \tfrac{1}{2} r - 1. \tag{70}$$

Then the error locator and evaluator polynomials are given by

$$\sigma(z) = \delta U_k(z), \tag{71}$$
$$\omega(z) = (-1)^k \delta r_k(z), \tag{72}$$

where $\delta$ is a constant chosen to make $\sigma(0) = 1$, and satisfy

$$\omega(z) \equiv \sigma(z) S(z) \bmod z^r, \tag{73}$$
$$\deg \sigma(z) \le \tfrac{1}{2} r, \tag{74}$$
$$\deg \omega(z) \le \tfrac{1}{2} r - 1. \tag{75}$$

Proof. (73) follows from (69). Also $\deg \omega = \deg r_k \le \tfrac{1}{2} r - 1$ and $\deg \sigma =
\deg U_k = \deg r_{-1} - \deg r_{k-1} \le \tfrac{1}{2} r$ from (70). Q.E.D.
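The following Python sketch illustrates Stage II, and for completeness the error-value formula of Stage III, on a toy case. It is not the authors' implementation and makes simplifying assumptions that are ours: the field is the prime field GF(13) rather than $GF(q^m)$, all $y_i$ are taken to be 1, polynomials are coefficient lists with the constant term first, and every function name is hypothetical.

```python
P = 13  # illustrative prime field GF(13); an assumption made for brevity


def trim(f):
    """Drop trailing zero coefficients in place; [] is the zero polynomial."""
    while f and f[-1] == 0:
        f.pop()
    return f


def add(f, g):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim([(a + b) % P for a, b in zip(f, g)])


def mul(f, g):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % P
    return trim(out)


def divmod_poly(f, g):
    """Quotient and remainder of f / g over GF(P), g nonzero."""
    f = f[:]
    q = [0] * max(len(f) - len(g) + 1, 1)
    lead_inv = pow(g[-1], P - 2, P)
    while len(f) >= len(g):
        c = f[-1] * lead_inv % P
        s = len(f) - len(g)
        q[s] = c
        for i, b in enumerate(g):
            f[i + s] = (f[i + s] - c * b) % P
        trim(f)
    return trim(q), f


def evaluate(f, x):
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % P
    return acc


def solve_key_equation(S, r):
    """Stage II: run Euclid on z^r and S(z) until deg r_k <= r/2 - 1,
    then return (sigma, omega) as in (71) and (72)."""
    r_prev, r_cur = [0] * r + [1], trim(S[:])
    U_prev, U_cur = [], [1]                      # U_{-1} = 0, U_0 = 1
    k = 0
    while len(r_cur) - 1 > r // 2 - 1:
        q, rem = divmod_poly(r_prev, r_cur)
        U_prev, U_cur = U_cur, add(mul(q, U_cur), U_prev)   # recurrence (60)
        r_prev, r_cur = r_cur, rem
        k += 1
    delta = pow(U_cur[0], P - 2, P)              # makes sigma(0) = 1
    sign = 1 if k % 2 == 0 else P - 1            # (-1)^k in GF(P)
    sigma = [c * delta % P for c in U_cur]
    omega = [c * delta * sign % P for c in r_cur]
    return sigma, omega


def error_values(sigma, omega):
    """Stage III with all y_i = 1: locators are reciprocals of the roots
    of sigma; values are omega(1/X) / prod over other X' of (1 - X'/X)."""
    locators = [X for X in range(1, P) if evaluate(sigma, pow(X, P - 2, P)) == 0]
    errors = {}
    for X in locators:
        X_inv = pow(X, P - 2, P)
        denom = 1
        for other in locators:
            if other != X:
                denom = denom * (1 - other * X_inv) % P
        errors[X] = evaluate(omega, X_inv) * pow(denom, P - 2, P) % P
    return errors


def syndromes(errors, r):
    """S_mu = sum of Y * X^mu over the errors {X: Y}, as in (65) with y_i = 1."""
    return [sum(Y * pow(X, mu, P) for X, Y in errors.items()) % P for mu in range(r)]
```

For instance, with r = 4 and errors of values 3 and 7 at locators 2 and 5, `solve_key_equation(syndromes({2: 3, 5: 7}, 4), 4)` yields the coefficients of $\sigma(z) = (1 - 2z)(1 - 5z)$ and $\omega(z) = 10 + 10z$ reduced mod 13, and `error_values` recovers the two errors.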


That these are the correct values of $\sigma(z)$ and $\omega(z)$ follows from

Theorem 16. The polynomials $\sigma(z)$ and $\omega(z)$ given by (71), (72) are the unique
solution to (73) with $\sigma(0) = 1$, $\deg \sigma \le \tfrac{1}{2} r$, $\deg \omega \le \tfrac{1}{2} r - 1$, and $\deg \sigma$ as small as
possible.

Proof. (1) We first show that if there are two solutions to (73), say $\sigma, \omega$ and
$\sigma', \omega'$, with $\deg \sigma \le \tfrac{1}{2} r$, $\deg \omega \le \tfrac{1}{2} r - 1$, $\deg \sigma' \le \tfrac{1}{2} r$, $\deg \omega' \le \tfrac{1}{2} r - 1$, then

$$\sigma = \mu \sigma' \quad \text{and} \quad \omega = \mu \omega'$$

for some polynomial $\mu$. In fact,

$$\omega \equiv \sigma S \bmod z^r, \quad \omega' \equiv \sigma' S \bmod z^r,$$
$$\omega \sigma' \equiv \omega' \sigma \bmod z^r.$$

But the degree of each side of this congruence is less than r. Therefore
$\omega \sigma' = \omega' \sigma$.

(2) If the solution $\sigma, \omega$ given by (71), (72) is not that for which $\deg \sigma$ is
smallest, then from (1),

$$\sigma = \mu \sigma' \quad \text{and} \quad \omega = \mu \omega'$$

for some $\mu$, where $\sigma', \omega'$ is also a solution. Then from (63) and (72),

$$\omega(z) = (-1)^k \delta r_k(z) = -\delta V_k(z) z^r + \sigma(z) S(z),$$
$$\mu(z) \omega'(z) = -\delta V_k(z) z^r + \mu(z) \sigma'(z) S(z). \tag{76}$$

Also from (73), for some $\psi(z)$,

$$\omega'(z) = \sigma'(z) S(z) + \psi(z) z^r. \tag{77}$$

(76) and (77) imply $\mu(z) \mid V_k(z)$. But $\mu(z)$ also divides $\sigma(z) = \delta U_k(z)$. Since
$U_k(z)$ and $V_k(z)$ are relatively prime (Problem 22), $\mu(z)$ is a constant. Q.E.D.


Stage III is as for BCH codes: the error locators $X_i$ are found as the
reciprocals of the roots of $\sigma(z)$, and then the error values are given by


e(X,^) 
x, I -XX 


vA 


Y, = (78) 


Notes on Chapter 12 


§2. Alternant codes were defined and studied by Helgert in a series of papers
[631-635]. Many important results are given by Delsarte [359]. The name
alternant code is based on the fact that a matrix or determinant of the form

$$\begin{pmatrix}
f_0(x_0) & f_1(x_0) & \cdots & f_{n-1}(x_0) \\
\vdots & \vdots & & \vdots \\
f_0(x_{n-1}) & f_1(x_{n-1}) & \cdots & f_{n-1}(x_{n-1})
\end{pmatrix}$$

is called an alternant; see for example Muir [974, pp. 341, 346].
Other generalizations of Goppa codes have been given by Mandelbaum 
[903] and Tzeng and Zimmerman [1356]. 


§3. Goppa introduced his codes in [536] and [537]. See also the short survey
article [125] by Berlekamp. §§3,4 are partly based on these three papers. 
Further properties are given by Goppa [538-539] and Sugiyama et al. [1291]; 
these papers describe a number of other good codes that can be constructed 
using Goppa codes as building blocks. For the reference to Gabidulin codes see 
Goppa [537]. 


§5. Theorem 11 is due to Berlekamp and Moreno [130], and Theorem 12 to
Tzeng and Zimmerman [1355]. Other classes of extended Goppa codes which 
are cyclic have been given in [1355] and in Tzeng and Yu [1353]. However, 
Research Problem 12.3 remains unsolved. For PGL, (q) and related groups 
see for example Huppert [676, p. 177]. 


§6. The first published reference to Srivastava codes is Berlekamp
[113, §15.1]. These codes and generalized Srivastava codes have been studied
by Helgert [631-635] and this section is based on his work. The actual
dimension and minimum distance of a number of Srivastava codes are given
in [631] and [632]. Sugiyama et al. [1289] have found some good codes by
modifying Srivastava codes.


§7. Generalized BCH codes are described by Chien and Choy in [286], and 
many further properties of these codes are given in that paper. 


§8. For the Euclidean algorithm see for example Niven and Zuckerman [995],
Stark [1272], and Uspensky and Heaslet [1359]. 


§9. The idea of expressing the key step in the decoding as finding the shortest 
recurrence that will produce a given sequence is due to Berlekamp [113, Ch. 7] 
and Massey [992]. Helgert [635] gave the key equation (67) for decoding 
alternant codes. See also Retter [1109, 1110] for Goppa codes. The use of the 
Euclidean algorithm for solving the key equation seems to be due to Sugi- 
yama et al. [1288] (see also [1290]), and our treatment follows that paper. Mills 
[960] has also noticed the connection between decoding and the continued 
fraction algorithm. A different decoding algorithm, also based on the 
Euclidean algorithm, has been given by Mandelbaum [903] (see also [902]).
Yet another algorithm has been proposed by Patterson [1030]. See also Sain 
[1141]. 


Which decoding algorithm is best? Aho, Hopcroft and Ullman [16, §8.9]
describe an algorithm which computes the GCD of two polynomials of degree
$n$ in $O(n \log^2 n)$ steps. By using this algorithm instead of the Euclidean
algorithm, Sarwate [1145] shows that a $t$-error-correcting Goppa code of
length $n$, for fixed $t/n$, can be decoded in $O(n \log^2 n)$ arithmetic operations. (A
similar result for RS codes has been obtained by Justesen [706].)

Similarly, using Theorem 29 of Ch. 9, a primitive binary BCH code of 
length n and rate R can be decoded up to its designed distance in O(n log n) 
arithmetic operations. These results are better than those obtained with the 
Euclidean algorithm, but unfortunately only for excessively large values of n. 
For practical purposes the original version of the Berlekamp algorithm 
(Berlekamp [113, Ch. 7]) is probably the fastest, although this depends on the 
machinery available for the decoding. Nevertheless, decoding using the 
Euclidean algorithm is by far the simplest to understand, and is certainly at 
least comparable in speed with the other methods (for n < 10°) and so it is the 
method we prefer. 


Reed-Muller codes 


§1. Introduction


Reed-Muller (or RM) codes are one of the oldest and best understood 
families of codes. However, except for first-order RM codes and codes of 
modest block lengths, their minimum distance is lower than that of BCH 
codes. But the great merit of RM codes is that they are relatively easy to 
decode, using majority-logic circuits (see §§6 and 7).

In fact RM codes are the simplest examples of the class of geometrical 
codes, which also includes Euclidean geometry and projective geometry 
codes, all of which can be decoded by majority logic. A brief account of these 
geometrical codes is given in §8, but regretfully space does not permit a more
detailed treatment. In compensation we give a fairly complete bibliography. 

§§2, 3 and 4 give the basic properties of RM codes, and Figs. 13.3 and 13.4
give a summary of the properties. Sections 9, 10 and 11 consider the 
automorphism groups and Mattson-Solomon polynomials of these codes. 

The next chapter will discuss 1st order RM codes and their applications,
while Chapter 15 studies 2nd order RM codes, and also the general problem of
finding weight enumerators of RM codes. 


§2. Boolean functions


Reed-Muller codes can be defined very simply in terms of Boolean
functions. We are going to define codes of length $n = 2^m$, and to do so we shall
need $m$ variables $v_1, \ldots, v_m$ which take the values 0 or 1. Alternatively, let
$v = (v_1, \ldots, v_m)$ range over $V^m$, the set of all binary $m$-tuples. [In earlier
chapters this set has been called $F^m$, but in Chs. 13 and 14 it will be more
convenient to call it $V^m$.] Any function $f(v) = f(v_1, \ldots, v_m)$ which takes on the
values 0 and 1 is called a Boolean function (abbreviated B.f.). Such a function
can be specified by a truth table, which gives the value of $f$ at all of its $2^m$
arguments. For example when $m = 3$ one such Boolean function is specified
by the truth table:


    v_3 = 0 0 0 0 1 1 1 1
    v_2 = 0 0 1 1 0 0 1 1
    v_1 = 0 1 0 1 0 1 0 1
    f   = 0 0 0 1 1 0 0 0


The last row of the table gives the values taken by $f$, and is a binary vector of
length $n = 2^m$ which is denoted by $\mathbf{f}$. A code will consist of all vectors $\mathbf{f}$,
where the function $f$ belongs to a certain class.

The columns of the truth table are for the moment assumed to have the
natural ordering illustrated above.

The last row of the truth table can be filled in arbitrarily, so there are $2^{2^m}$
Boolean functions of $m$ variables.

The usual logical operations may be applied to Boolean functions: 


$$\begin{aligned}
f \text{ EXCLUSIVE OR } g &= f + g, \\
f \text{ AND } g &= fg, \\
f \text{ OR } g &= f + g + fg, \\
\text{NOT } f &= \bar{f} = 1 + f.
\end{aligned} \tag{1}$$

The right-hand side of these equations defines the operations in terms of
binary functions.


A Boolean function can be written down immediately from its truth table:
in the preceding example,

$$f = v_1 v_2 \bar{v}_3 \ \text{OR}\ \bar{v}_1 \bar{v}_2 v_3,$$

since the right-hand side is equal to 1 exactly when $f$ is. This is called the
disjunctive normal form for $f$ (see Problem 1).
Using the rules (1) this simplifies to (check!)

$$f = v_3 + v_1 v_2 + v_1 v_3 + v_2 v_3.$$


Notice that $v_i^2 = v_i$. It is clear that in this way any Boolean function can be
expressed as a sum of the $2^m$ functions

$$1,\ v_1,\ v_2,\ \ldots,\ v_m,\ v_1v_2,\ v_1v_3,\ \ldots,\ v_{m-1}v_m,\ \ldots,\ v_1v_2\cdots v_m, \tag{2}$$

with coefficients which are 0 or 1. Since there are $2^{2^m}$ Boolean functions
altogether, all these sums must be distinct.

In other words the $2^m$ vectors corresponding to the functions (2) are
linearly independent.

Problem. (1) (Decomposition of Boolean functions.) If $f(v_1, \ldots, v_m)$ is a
B.f., show the following.

(a) $f(v_1, \ldots, v_m) = v_m f(v_1, \ldots, v_{m-1}, 1)$ OR $\bar{v}_m f(v_1, \ldots, v_{m-1}, 0)$, and

(b) $f(v_1, \ldots, v_m) = v_m g(v_1, \ldots, v_{m-1}) + h(v_1, \ldots, v_{m-1})$ where $g, h$ are B.f.'s.

(c) Disjunctive normal form:

$$f(v_1, \ldots, v_m) = \sum_{i_1=0}^{1} \cdots \sum_{i_m=0}^{1} f(i_1, \ldots, i_m)\, w_1^{i_1} \cdots w_m^{i_m},$$

where

$$w_i^1 = v_i, \qquad w_i^0 = \bar{v}_i.$$


Theorem 1. Any Boolean function $f$ can be expanded in powers of $v_i$ as

$$f(v_1, \ldots, v_m) = \sum_{a \in V^m} g(a)\, v_1^{a_1} \cdots v_m^{a_m}, \tag{3}$$

where the coefficients are given by

$$g(a) = \sum_{b \subset a} f(b_1, \ldots, b_m), \tag{4}$$

and $b \subset a$ means that the 1's in $b$ are a subset of the 1's in $a$.


Proof. For $m = 1$, the disjunctive normal form for $f$ is
$$f(v_1) = f(0)(1 + v_1) + f(1)v_1 = f(0)\,1 + (f(0) + f(1))v_1,$$
which proves (3) and (4). Similarly for $m = 2$ we have
$$\begin{aligned}
f(v_1, v_2) &= f(0,0)(1 + v_1)(1 + v_2) + f(0,1)(1 + v_1)v_2 + f(1,0)v_1(1 + v_2) + f(1,1)v_1v_2 \\
&= f(0,0)\,1 + \{f(0,0) + f(1,0)\}v_1 + \{f(0,0) + f(0,1)\}v_2 \\
&\quad + \{f(0,0) + f(0,1) + f(1,0) + f(1,1)\}v_1v_2,
\end{aligned}$$


which again proves (3) and (4). Clearly we can continue in this way. Q.E.D. 
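
Equation (4) can be evaluated for all $a$ at once by a standard in-place subset-sum over GF(2). The sketch below assumes the natural ordering of the truth table used in this section, so that column $j$ carries $v_i$ equal to bit $i-1$ of $j$; the function name `anf_coefficients` is illustrative only.

```python
def anf_coefficients(truth):
    """Coefficients g(a) of Theorem 1 from a truth table of length 2^m.

    truth[j] is f at the column whose binary index j has v_i in bit i-1;
    the returned list uses the same indexing for the exponent vector a.
    """
    g = list(truth)
    m = len(g).bit_length() - 1
    for i in range(m):                      # fold in one variable at a time
        for a in range(len(g)):
            if a >> i & 1:                  # a has a 1 in position i
                g[a] ^= g[a ^ (1 << i)]     # add the value with that 1 removed
    return g
```

For the example $f = 00011000$ of this section the result is `[0, 0, 0, 1, 1, 1, 1, 0]`, that is, nonzero coefficients at $v_1v_2$ (index 3), $v_3$ (index 4), $v_1v_3$ (index 5) and $v_2v_3$ (index 6), in agreement with the simplified form of $f$ found above.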


Problem. (2) (The randomization lemma.) Let $f(v_1, \ldots, v_{m-1})$ be an arbitrary
Boolean function. Show that $v_m + f(v_1, \ldots, v_{m-1})$ takes the values 0 and 1
equally often.


§3. Reed-Muller codes


As in §2, $v = (v_1, \ldots, v_m)$ denotes a vector which ranges over $V^m$, and $\mathbf{f}$ is
the vector of length $2^m$ obtained from a Boolean function $f(v_1, \ldots, v_m)$.

Definition. The $r$th order binary Reed-Muller (or RM) code $\mathscr{R}(r, m)$ of length
$n = 2^m$, for $0 \le r \le m$, is the set of all vectors $\mathbf{f}$, where $f(v_1, \ldots, v_m)$ is a
Boolean function which is a polynomial of degree at most $r$.

For example, the first order RM code of length 8 consists of the 16
codewords


$$a_0 \mathbf{1} + a_1 \mathbf{v}_1 + a_2 \mathbf{v}_2 + a_3 \mathbf{v}_3, \qquad a_i = 0 \text{ or } 1,$$


which are shown in Fig. 13.1. 


    0                      00000000
    v_3                    00001111
    v_2                    00110011
    v_1                    01010101
    v_2 + v_3              00111100
    v_1 + v_3              01011010
    v_1 + v_2              01100110
    v_1 + v_2 + v_3        01101001
    1                      11111111
    1 + v_3                11110000
    1 + v_2                11001100
    1 + v_1                10101010
    1 + v_2 + v_3          11000011
    1 + v_1 + v_3          10100101
    1 + v_1 + v_2          10011001
    1 + v_1 + v_2 + v_3    10010110


Fig. 13.1. The 1st order Reed-Muller code of length 8.


This code is also shown in Fig. 1.14, and indeed we shall see that $\mathscr{R}(1, m)$ is
always the dual of an extended Hamming code. $\mathscr{R}(1, m)$ is also the code $\mathscr{C}_{2^m}$
obtained from a Sylvester-type Hadamard matrix in §3 of Ch. 2.

All the codewords of $\mathscr{R}(1, m)$ except $\mathbf{0}$ and $\mathbf{1}$ have weight $2^{m-1}$. Indeed any
B.f. of degree exactly 1 corresponds to a vector of weight $2^{m-1}$, by Problem 2.

In general the $r$th order RM code consists of all linear combinations of the
vectors corresponding to the products


$$1,\ v_1, \ldots, v_m,\ v_1v_2,\ v_1v_3, \ldots, v_{m-1}v_m,\ \ldots \quad (\text{up to degree } r),$$


which therefore form a basis for the code. There are


$$k = 1 + \binom{m}{1} + \binom{m}{2} + \cdots + \binom{m}{r}$$

such basis vectors, and as we saw in §2 they are linearly independent.
So $k$ is the dimension of the code.

For example when $m = 4$ the 16 possible basis vectors for Reed-Muller
codes of length 16 are shown in Fig. 13.2 (check!).


    1             1111111111111111
    v_4           0000000011111111
    v_3           0000111100001111
    v_2           0011001100110011
    v_1           0101010101010101
    v_3v_4        0000000000001111
    v_2v_4        0000000000110011
    v_1v_4        0000000001010101
    v_2v_3        0000001100000011
    v_1v_3        0000010100000101
    v_1v_2        0001000100010001
    v_2v_3v_4     0000000000000011
    v_1v_3v_4     0000000000000101
    v_1v_2v_4     0000000000010001
    v_1v_2v_3     0000000100000001
    v_1v_2v_3v_4  0000000000000001


Fig. 13.2. Basis vectors for RM codes of length 16.


The basis vectors for the $r$th order RM code of length 16, $\mathscr{R}(r, 4)$, are:

    Order r    Rows of Fig. 13.2
       0       1
       1       1-5
       2       1-11
       3       1-15
       4       all
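
The rows of Fig. 13.2 can be generated mechanically from the monomials. The following sketch assumes the natural ordering of columns (column $j$ carries $v_i$ equal to bit $i-1$ of $j$) and lists the monomials degree by degree in the same order as the figure; the name `rm_basis` is illustrative.

```python
from itertools import combinations

def rm_basis(r, m):
    """Basis rows of R(r, m) as (label, bit string) pairs, lowest degree first."""
    rows = []
    for d in range(r + 1):
        for subset in combinations(range(m, 0, -1), d):
            label = "".join(f"v{i}" for i in sorted(subset)) or "1"
            # column j (0 <= j < 2^m) carries v_i = bit (i - 1) of j
            bits = "".join(str(int(all(j >> (i - 1) & 1 for i in subset)))
                           for j in range(1 << m))
            rows.append((label, bits))
    return rows

for label, bits in rm_basis(2, 4):
    print(f"{label:6s} {bits}")
```

Taking `r = 2`, `m = 4` prints the first 11 rows of the figure, and `rm_basis(4, 4)` gives all 16.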


Reed-Muller codes of length $2^{m+1}$ may be very simply obtained from RM
codes of length $2^m$ using the $|u|u+v|$ construction of §9 of Ch. 2.


Theorem 2.

$$\mathscr{R}(r+1, m+1) = \{\, |u|u+v| : u \in \mathscr{R}(r+1, m),\ v \in \mathscr{R}(r, m) \,\}.$$


For example $\mathscr{R}(1, 4)$ was constructed in this way in §9 of Ch. 2.
There is an equivalent statement in terms of generator matrices. Let
$G(r, m)$ be a generator matrix for $\mathscr{R}(r, m)$. Then the theorem says


$$G(r+1, m+1) = \begin{pmatrix} G(r+1, m) & G(r+1, m) \\ 0 & G(r, m) \end{pmatrix}.$$

(Indeed, a codeword generated by this matrix has the form $|u|u+v|$ where
$u \in \mathscr{R}(r+1, m)$, $v \in \mathscr{R}(r, m)$.)


Proof. By definition a typical codeword $\mathbf{f}$ in $\mathscr{R}(r+1, m+1)$ comes from a
polynomial $f(v_1, \ldots, v_{m+1})$ of degree at most $r+1$. We can write (as in
Problem 1)
$$f(v_1, \ldots, v_{m+1}) = g(v_1, \ldots, v_m) + v_{m+1} h(v_1, \ldots, v_m),$$

where $\deg(g) \le r+1$ and $\deg(h) \le r$. Let $\mathbf{g}$ and $\mathbf{h}$ be the vectors (of length $2^m$)
corresponding to $g(v_1, \ldots, v_m)$ and $h(v_1, \ldots, v_m)$. Of course $\mathbf{g} \in \mathscr{R}(r+1, m)$
and $\mathbf{h} \in \mathscr{R}(r, m)$. But now consider $g(v_1, \ldots, v_m)$ and $v_{m+1}h(v_1, \ldots, v_m)$ as
polynomials in $v_1, \ldots, v_{m+1}$. The corresponding vectors (now of length $2^{m+1}$)
are $|\mathbf{g}|\mathbf{g}|$ and $|\mathbf{0}|\mathbf{h}|$ (see Problem 7). Therefore $\mathbf{f} = |\mathbf{g}|\mathbf{g}| + |\mathbf{0}|\mathbf{h}|$. Q.E.D.


Notice the resemblance between the recurrence for RM codes in Theorem
2 and the recurrence for binomial coefficients

$$\binom{m+1}{r+1} = \binom{m}{r+1} + \binom{m}{r}. \tag{5}$$
(See Problem 4.)


The minimum distance of Reed-Muller codes is easily obtained from 
Theorem 2. 


Theorem 3. $\mathscr{R}(r, m)$ has minimum distance $2^{m-r}$.
Proof. By induction on m, using Theorem 33 of Ch. 2. Q.E.D. 
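
For small parameters Theorems 2 and 3 can be checked by brute force. The sketch below builds the code recursively from the $|u|u+v|$ construction; the two base cases (the code of length 1 is all of GF(2), and a code of negative order is taken to be the zero code so that $\mathscr{R}(0, m)$ comes out as the repetition code) are conventions assumed here for the recursion, and the names `rm_code` and `min_distance` are illustrative.

```python
def rm_code(r, m):
    """All codewords of R(r, m), as tuples of bits, via the |u|u+v| recursion."""
    if r < 0:
        return {(0,) * (1 << m)}          # assumed convention: R(-1, m) = {0}
    if m == 0:
        return {(0,), (1,)}               # length 1: all vectors
    upper = rm_code(r, m - 1)             # the u's
    lower = rm_code(r - 1, m - 1)         # the v's
    return {u + tuple(a ^ b for a, b in zip(u, v)) for u in upper for v in lower}

def min_distance(code):
    """Minimum weight of a nonzero codeword (equals the minimum distance of a linear code)."""
    return min(sum(c) for c in code if any(c))
```

For instance `rm_code(2, 4)` has $2^{11}$ codewords and minimum distance $4 = 2^{4-2}$, and `rm_code(1, 3)` is the code of Fig. 13.1.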


Figure 13.3 shows the dimensions of the first few [n,k, d] RM codes 
(check!) 

Notice that $\mathscr{R}(m, m)$ contains all vectors of length $2^m$; $\mathscr{R}(m-1, m)$ consists
of all even weight vectors, and $\mathscr{R}(0, m)$ consists of the vectors $\mathbf{0}$, $\mathbf{1}$
(Problem 5).
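
The entries of Fig. 13.3 follow directly from the formulas for $k$ and $d$. A minimal sketch (the function name `rm_parameters` is illustrative):

```python
from math import comb

def rm_parameters(r, m):
    """Return [n, k, d] for R(r, m): n = 2^m, k = sum of C(m, i) for i <= r, d = 2^(m-r)."""
    n = 2 ** m
    k = sum(comb(m, i) for i in range(r + 1))
    d = 2 ** (m - r)
    return n, k, d

# one line of the table per length: the codes R(m, m), R(m-1, m), ..., R(0, m)
for m in range(2, 10):
    print(m, [rm_parameters(r, m)[1:] for r in range(m, -1, -1)])
```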


Theorem 4. $\mathscr{R}(m-r-1, m)$ is the dual code to $\mathscr{R}(r, m)$, for $0 \le r \le m-1$.


Proof. Take $\mathbf{a} \in \mathscr{R}(m-r-1, m)$, $\mathbf{b} \in \mathscr{R}(r, m)$. Then $a(v_1, \ldots, v_m)$ is a
polynomial of degree $\le m-r-1$, $b(v_1, \ldots, v_m)$ has degree $\le r$, and their product
$ab$ has degree $\le m-1$. Therefore $\mathbf{ab} \in \mathscr{R}(m-1, m)$ and has even weight.
Therefore the dot product $\mathbf{a} \cdot \mathbf{b} \equiv 0$ (modulo 2). So $\mathscr{R}(m-r-1, m) \subset \mathscr{R}(r, m)^\perp$. But
$$\dim \mathscr{R}(m-r-1, m) + \dim \mathscr{R}(r, m) = 1 + \binom{m}{1} + \cdots + \binom{m}{m-r-1} + 1 + \binom{m}{1} + \cdots + \binom{m}{r} = 2^m,$$
which implies $\mathscr{R}(m-r-1, m) = \mathscr{R}(r, m)^\perp$. Q.E.D.


    length n      4    8   16   32   64  128  256  512
    m             2    3    4    5    6    7    8    9

    distance d                  dimension k
        1         4    8   16   32   64  128  256  512
        2         3    7   15   31   63  127  255  511
        4         1    4   11   26   57  120  247  502
        8              1    5   16   42   99  219  466
       16                   1    6   22   64  163  382
       32                        1    7   29   93  256
       64                             1    8   37  130
      128                                  1    9   46
      256                                       1   10
      512                                            1


Fig. 13.3. Reed-Muller codes.
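
The two ingredients of this proof, orthogonality of the monomial bases and the count of dimensions, are easy to confirm numerically for small $m$. In this sketch the variables are indexed from 0 (so $v_{i+1}$ is bit $i$ of the column index), and `monomial_rows` and `is_dual_pair` are illustrative names.

```python
from itertools import combinations

def monomial_rows(r, m):
    """Vectors of all monomials of degree <= r in m variables (a basis of R(r, m))."""
    return [[int(all(j >> i & 1 for i in subset)) for j in range(1 << m)]
            for d in range(r + 1)
            for subset in combinations(range(m), d)]

def is_dual_pair(r, m):
    """Check that the bases of R(m-r-1, m) and R(r, m) are orthogonal and the dimensions add to 2^m."""
    A, B = monomial_rows(m - r - 1, m), monomial_rows(r, m)
    orthogonal = all(sum(x & y for x, y in zip(a, b)) % 2 == 0 for a in A for b in B)
    return orthogonal and len(A) + len(B) == 1 << m
```

Since each set of rows is linearly independent (§2), a `True` result for given `r`, `m` is exactly the statement of Theorem 4 for those parameters.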


These properties are collected in Fig. 13.4. 


For any $m$ and any $r$, $0 \le r \le m$, there is a binary $r$th order RM code
$\mathscr{R}(r, m)$ with the following properties:


    length $n = 2^m$,
    dimension $k = 1 + \binom{m}{1} + \cdots + \binom{m}{r}$,
    minimum distance $d = 2^{m-r}$.


$\mathscr{R}(r, m)$ consists of (the vectors corresponding to) all polynomials
in the binary variables $v_1, \ldots, v_m$ of degree $\le r$. The dual of
$\mathscr{R}(r, m)$ is $\mathscr{R}(m-r-1, m)$ (Theorem 4). $\mathscr{R}(r, m)$ is an extended
cyclic code (Theorem 11), and can be easily encoded and decoded (by
majority logic, §§6, 7). Decoding is especially easy for first-order RM
codes (Ch. 14). Aut$(\mathscr{R}(r, m)) = GA(m)$ (Theorem 24). Good practical
codes, also the basis for constructing many other codes (Ch. 15).


Fig. 13.4. Summary of Reed-Muller codes.

Punctured Reed-Muller codes.


Definition. For $0 \le r \le m-1$, the punctured RM code $\mathscr{R}(r, m)^*$ is obtained by
puncturing (or deleting) the coordinate corresponding to $v_1 = \cdots = v_m = 0$
from all the codewords of $\mathscr{R}(r, m)$.


(In fact we shall see in the next section that an equivalent code is obtained
no matter which coordinate is punctured.)

Clearly $\mathscr{R}(r, m)^*$ has length $2^m - 1$, minimum distance $2^{m-r} - 1$, and dimension
$1 + \binom{m}{1} + \cdots + \binom{m}{r}$.


Problems. (3) Show that $\mathscr{R}(r, m)$ is a subcode of $\mathscr{R}(r+1, m)$. In fact show that
$\mathscr{R}(r+1, m) = \{\mathbf{a} + \mathbf{b} : \mathbf{a} \in \mathscr{R}(r, m),\ b$ is zero or a polynomial in $v_1, \ldots, v_m$ of degree
exactly $r+1\}$.

(4) If Theorem 2 is used as a recursive definition of RM codes, use
Equation (5) to calculate their dimension. Obtain Fig. 13.3 from Pascal's
triangle for binomial coefficients.

(5) Show that $\mathscr{R}(0, m)$ and $\mathscr{R}(0, m)^*$ are repetition codes, $\mathscr{R}(m-1, m)$
contains all vectors of even weight, and $\mathscr{R}(m, m)$ and $\mathscr{R}(m-1, m)^*$ contain
all vectors, of the appropriate lengths.

(6) Let $|S|T| = \{|s|t| : s \in S,\ t \in T\}$. Show that


$$\mathscr{R}(r+1, m+1) = \bigcup |S|S|$$


where $S$ runs through those cosets of $\mathscr{R}(r, m)$ which are contained in
$\mathscr{R}(r+1, m)$.

(7) Let $f(v_1, \ldots, v_m)$ be a B.f. of $m$ variables, and let $\mathbf{f}$ be the
corresponding binary vector of length $2^m$. Show that the vectors of length $2^{m+1}$
corresponding to $g(v_1, \ldots, v_{m+1}) = f(v_1, \ldots, v_m)$ and to $h(v_1, \ldots, v_{m+1}) =
v_{m+1} f(v_1, \ldots, v_m)$ are $|\mathbf{f}|\mathbf{f}|$ and $|\mathbf{0}|\mathbf{f}|$ respectively.


§4. RM codes and geometries


Many properties of RM codes are best stated in the language of finite
geometries. The Euclidean geometry $EG(m, 2)$ of dimension $m$ over $GF(2)$
contains $2^m$ points, whose coordinates are all the binary vectors $v =
(v_1, \ldots, v_m)$ of length $m$. If the zero point is deleted, the projective geometry
$PG(m-1, 2)$ is obtained. (See Appendix B for further details.)

Any subset $S$ of the points of $EG(m, 2)$ has associated with it a binary
incidence vector $\chi(S)$ of length $2^m$, containing a 1 in those components $s \in S$,
and zeros elsewhere.

This gives us another way of thinking about codewords of $\mathscr{R}(r, m)$, namely
as (incidence vectors of) subsets of $EG(m, 2)$.

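For example, the Euclidean geometry $EG(3, 2)$ consists of 8 points
$P_0, P_1, \ldots, P_7$ whose coordinates we may take to be the following column
vectors:

                 P_0  P_1  P_2  P_3  P_4  P_5  P_6  P_7
    \bar{v}_3     1    1    1    1    0    0    0    0
    \bar{v}_2     1    1    0    0    1    1    0    0
    \bar{v}_1     1    0    1    0    1    0    1    0


The subset $S = \{P_2, P_3, P_4, P_5\}$ has incidence vector
$$\chi(S) = 00111100.$$


This is a codeword of $\mathscr{R}(1, 3)$; see Fig. 13.1.
For any value of $m$ let us write the complements of the vectors $\mathbf{v}_m, \ldots, \mathbf{v}_1$
as follows:


$$\begin{array}{rl}
\bar{\mathbf{v}}_m = & 1\,1 \cdots 1\,1\ \ 0\,0 \cdots 0\,0 \\
\vdots & \\
\bar{\mathbf{v}}_2 = & 1\,1\,0\,0 \cdots 1\,1\,0\,0 \\
\bar{\mathbf{v}}_1 = & 1\,0\,1\,0 \cdots 1\,0\,1\,0.
\end{array}$$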


We take the columns of this array to be the coordinates of the points in
$EG(m, 2)$. In this way there is a 1-to-1 correspondence between the points of
$EG(m, 2)$ and the components (or coordinate positions) of binary vectors of
length $2^m$. Any vector $x$ of length $2^m$ describes a subset of $EG(m, 2)$,
consisting of those points $P$ for which $x_P = 1$. Clearly $x$ is the incidence
vector of this subset. The number of points in the subset is equal to the
weight of $x$.

For example the vectors $\mathbf{v}_i$ themselves are the characteristic vectors of
hyperplanes which pass through the origin, the $\mathbf{v}_i\mathbf{v}_j$ describe subspaces of
dimension $m-2$, and so on. (Of course there are other hyperplanes through
the origin besides the $\mathbf{v}_i$. For example no $\mathbf{v}_i$ contains the point $11 \cdots 1$.
Similarly the $\mathbf{v}_i\mathbf{v}_j$ are not the only subspaces of dimension $m-2$, and so on.)

One of the advantages of this geometrical language is that it enables us to
say exactly what the codewords of minimum weight are in the $r$th order
Reed-Muller code of length $2^m$: they are the $(m-r)$-flats in $EG(m, 2)$. This is
proved in Theorems 5 and 7. In Theorem 9 we use this fact to determine the
number of codewords of minimum weight. Then in §5 we show that the
codewords of minimum weight generate $\mathscr{R}(r, m)$. Along the way we prove
that the punctured code $\mathscr{R}(r, m)^*$ is cyclic. The proofs given in §§4, 5 should
be omitted on a first reading.

Let $H$ be any hyperplane in $EG(m, 2)$. By definition the incidence vector
$\mathbf{h} = \chi(H)$ consists of all points $v$ which satisfy a linear equation in $v_1, \ldots, v_m$.
In other words, the Boolean function $h$ is a linear function of $v_1, \ldots, v_m$, and
so is a codeword of weight $2^{m-1}$ in $\mathscr{R}(r, m)$.

We remark that if $\mathbf{f} \in \mathscr{R}(r, m)$ is the incidence vector of a set $S$, then
$\mathbf{hf} \in \mathscr{R}(r+1, m)$ and is the incidence vector of $S \cap H$. We are now ready for
the first main theorem.


Theorem 5. Let $\mathbf{f}$ be a minimum weight codeword of $\mathscr{R}(r, m)$, say $\mathbf{f} = \chi(S)$.
Then $S$ is an $(m-r)$-dimensional flat in $EG(m, 2)$ (which need not pass
through the origin).

E.g. the 14 codewords of weight 4 in the [8, 4, 4] extended Hamming code
are the 14 planes of Euclidean 3-space $EG(3, 2)$.


Proof. Let $H$ be any hyperplane $EG(m-1, 2)$ in $EG(m, 2)$ and let $H'$ be the
parallel hyperplane, so that $EG(m, 2) = H \cup H'$.

By the above remark $S \cap H$ and $S \cap H'$ are in $\mathscr{R}(r+1, m)$, and so contain
0 or $\ge 2^{m-r-1}$ points. Since $|S| = 2^{m-r} = |S \cap H| + |S \cap H'|$, $|S \cap H| = 0$, $2^{m-r-1}$
or $2^{m-r}$. The following Lemma then completes the proof of the theorem.


Lemma 6. (Rothschild and Van Lint.) Let $S$ be a subset of $EG(m, 2)$ such that
$|S| = 2^{m-r}$, and $|S \cap H| = 0$, $2^{m-r-1}$ or $2^{m-r}$ for all hyperplanes $H$ in $EG(m, 2)$.
Then $S$ is an $(m-r)$-dimensional flat in $EG(m, 2)$.


Proof. By induction on $m$. The result is trivial for $m = 2$.


Case (i). Suppose for some $H$, $|S \cap H| = 2^{m-r}$. Then $S \subset H$, i.e. $S \subset
EG(m-1, 2)$. Let $X$ be any hyperplane in $H$. There exists another hyperplane
$H''$ of $EG(m, 2)$ such that $X = H \cap H''$, and $S \cap X = S \cap H''$, i.e. $|S \cap X| = 0$,
$2^{(m-1)-(r-1)-1}$ or $2^{(m-1)-(r-1)}$. By the induction hypothesis $S$ is an $((m-1)-(r-1))$-flat
in $EG(m-1, 2)$ and hence in $EG(m, 2)$.


Case (ii). If for some $H$, $|S \cap H| = 0$, then replacing $H$ by its parallel
hyperplane reduces this to case (i).


Case (iii). It remains to consider the case when $|S \cap H| = 2^{m-r-1}$ for all $H$.
Consider

$$\begin{aligned}
\sum_{H \subset EG(m,2)} |S \cap H|^2 &= \sum_{H \subset EG(m,2)} \Bigl( \sum_{a \in S} \chi_H(a) \Bigr)^2 \\
&= \sum_{a \in S} \sum_{b \in S} \sum_{H \subset EG(m,2)} \chi_H(a)\chi_H(b) \\
&= |S|(2^m - 1) + |S|(|S| - 1)(2^{m-1} - 1)
\end{aligned}$$
since there are $2^m - 1$ hyperplanes in $EG(m, 2)$ through a point and $2^{m-1} - 1$
through a line. The LHS is $2^{2m-2r-1}(2^m - 1)$. Substituting $|S| = 2^{m-r}$ on the RHS
leads to a contradiction. Q.E.D.


The converse of Theorem 5 is: 
Theorem 7. The incidence vector of any (m — r)-flat in EG(m, 2) is in R(r, m). 


Proof. Any $(m-r)$-flat in $EG(m, 2)$ consists of all points $v$ which satisfy $r$
linear equations over $GF(2)$, say

$$\sum_{j=1}^{m} a_{ij} v_j = b_i, \qquad i = 1, \ldots, r,$$
or equivalently
$$\sum_{j=1}^{m} a_{ij} v_j + b_i + 1 = 1, \qquad i = 1, \ldots, r.$$
This can be replaced by the single equation

$$\prod_{i=1}^{r} \Bigl( \sum_{j=1}^{m} a_{ij} v_j + b_i + 1 \Bigr) = 1,$$

i.e., by a polynomial equation of degree $\le r$ in $v_1, \ldots, v_m$. Therefore the flat is
in $\mathscr{R}(r, m)$. Q.E.D.
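
The degree bound in this proof can be observed numerically. In the sketch below a flat is described by pairs `(mask, b)`, each standing for the equation "the sum of the $v_j$ with bit $j-1$ set in `mask` equals `b`"; this encoding, the indexing of points directly by the values of $v_1, \ldots, v_m$, and the names `flat_incidence` and `anf_degree` are all assumptions made for illustration.

```python
def flat_incidence(equations, m):
    """Incidence vector of the set of v satisfying every linear equation (mask, b)."""
    return [int(all(bin(j & mask).count("1") % 2 == b for mask, b in equations))
            for j in range(1 << m)]

def anf_degree(truth):
    """Degree of the Boolean polynomial with the given truth table (-1 for the zero function)."""
    g = list(truth)
    for i in range(len(g).bit_length() - 1):
        for a in range(len(g)):
            if a >> i & 1:
                g[a] ^= g[a ^ (1 << i)]      # subset sums over GF(2), as in Theorem 1
    return max((bin(a).count("1") for a, c in enumerate(g) if c), default=-1)

# the 2-flat v1 + v2 = 1, v3 + v4 = 0 in EG(4, 2)
x = flat_incidence([(0b0011, 1), (0b1100, 0)], 4)
print(sum(x), anf_degree(x))
```

Here the two equations are independent, so the flat has $2^{4-2} = 4$ points and its incidence vector has degree 2.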


Combining Theorems 5 and 7 we obtain 


Theorem 8. The codewords of minimum weight in $\mathscr{R}(r, m)$ are exactly the
incidence vectors of the $(m-r)$-dimensional flats in $EG(m, 2)$.


Minimum weight codewords in $\mathscr{R}(r, m)^*$ are obtained from minimum
weight codewords in $\mathscr{R}(r, m)$ which have a 1 in coordinate 0 (by deleting that
1). Such a codeword is the incidence vector of a subspace $PG(m-r-1, 2)$ in
$PG(m-1, 2)$. (We remind the reader that in a projective geometry there is no
distinction between flats and subspaces; see Appendix B.)


Theorem 9. The number of codewords of minimum weight in:
(a) $\mathscr{R}(r, m)^*$ is

$$A_{2^{m-r}-1} = \prod_{i=0}^{m-r-1} \frac{2^{m-i} - 1}{2^{m-r-i} - 1},$$

(b) $\mathscr{R}(r, m)$ is

$$A_{2^{m-r}} = 2^r \prod_{i=0}^{m-r-1} \frac{2^{m-i} - 1}{2^{m-r-i} - 1}.$$

Proof. Theorems 3 and 5 of Appendix B. 
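
For short codes the count in Theorem 9(b), the number of $(m-r)$-flats in $EG(m, 2)$, can be compared with a direct enumeration. This sketch stores codewords as Python integers used as bit masks, with variables indexed from 0; the names `flats_formula`, `rm_codewords` and `min_weight_count` are illustrative.

```python
from itertools import combinations

def flats_formula(r, m):
    """2^r times the product over i < m - r of (2^(m-i) - 1) / (2^(m-r-i) - 1)."""
    num = den = 1
    for i in range(m - r):
        num *= 2 ** (m - i) - 1
        den *= 2 ** (m - r - i) - 1
    return 2 ** r * num // den

def rm_codewords(r, m):
    """All codewords of R(r, m) as integers (bit j = coordinate j)."""
    basis = []
    for d in range(r + 1):
        for subset in combinations(range(m), d):
            basis.append(sum(1 << j for j in range(1 << m)
                             if all(j >> i & 1 for i in subset)))
    words = {0}
    for b in basis:                       # span of the monomial vectors
        words |= {w ^ b for w in words}
    return words

def min_weight_count(r, m):
    """Return (minimum weight, number of codewords of that weight)."""
    weights = [bin(w).count("1") for w in rm_codewords(r, m) if w]
    d = min(weights)
    return d, weights.count(d)
```

For the [8, 4, 4] code $\mathscr{R}(1, 3)$ both methods give 14, the number of planes in $EG(3, 2)$.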


*§5. The minimum weight vectors generate the code


Theorem 10. The incidence vectors of the projective subspaces $PG(\mu-1, 2)$ of
$PG(m-1, 2)$ generate $\mathscr{R}(r, m)^*$, where $\mu = m - r$.


Proof. (The beginner should skip this.) Let $\alpha$ be a primitive element of
$GF(2^m)$. Then the points of $PG(m-1, 2)$ can be taken to be
$\{1, \alpha, \alpha^2, \alpha^3, \ldots, \alpha^{2^m-2}\}$. Let $l = 2^\mu - 2$.

A subset $T = \{\alpha^{d_0}, \ldots, \alpha^{d_l}\}$ of these points will be represented in the usual
way by the polynomial


$$w_T(x) = x^{d_0} + \cdots + x^{d_l}.$$


If $T = \{\alpha^{d_0}, \ldots, \alpha^{d_l}\}$ is a $PG(\mu-1, 2)$ then the points of $T$ are all nonzero
linear combinations over $GF(2)$ of $\mu$ linearly independent points $\alpha_0, \ldots, \alpha_{\mu-1}$
(say) of $GF(2^m)$. In other words the points of $T$ are


$$\sum_{j=0}^{\mu-1} a_{ij}\alpha_j = \alpha^{d_i}, \qquad i = 0, 1, \ldots, l,$$


where $(a_{i0}, a_{i1}, \ldots, a_{i,\mu-1})$ runs through all nonzero binary $\mu$-tuples. Also
$x w_T(x)$ represents the $PG(\mu-1, 2)$ spanned by $\alpha\alpha_0, \ldots, \alpha\alpha_{\mu-1}$. Thus every
cyclic shift of the incidence vector of a $PG(\mu-1, 2)$ is the incidence vector of
another $PG(\mu-1, 2)$.

Let $\mathscr{C}$ be the code generated by all $w_T(x)$, where $T$ is any $PG(\mu-1, 2)$.
Clearly $\mathscr{C}$ is a cyclic code and is contained in $\mathscr{R}(r, m)^*$; the theorem asserts
that in fact $\mathscr{C} = \mathscr{R}(r, m)^*$. We establish this by showing that

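$$\dim \mathscr{C} \ge 1 + \binom{m}{1} + \cdots + \binom{m}{r}.$$
The dimension of $\mathscr{C}$ is the number of $\alpha^s$ which are not zeros of $\mathscr{C}$; i.e. the
number of $\alpha^s$ such that $w_T(\alpha^s) \ne 0$ for some $T$. Now
$$w_T(\alpha^s) = \sum_{i=0}^{l} \alpha^{s d_i} = \sum_{b} \Bigl( \sum_{j=0}^{\mu-1} b_j \alpha_j \Bigr)^s,$$
where the summation extends over all nonzero binary $\mu$-tuples $b =
(b_0, \ldots, b_{\mu-1})$. Call this last expression $F_s(\alpha_0, \ldots, \alpha_{\mu-1})$. Then

$$\begin{aligned}
F_s(\alpha_0, \ldots, \alpha_{\mu-1}) &= \sum_{b_0, \ldots, b_{\mu-1}} (b_0\alpha_0 + \gamma)^s, \qquad \text{where } \gamma = b_1\alpha_1 + \cdots + b_{\mu-1}\alpha_{\mu-1}, \\
&= \sum_{b_1, \ldots, b_{\mu-1}} \bigl( \gamma^s + (\alpha_0 + \gamma)^s \bigr) \\
&= \sum_{b_1, \ldots, b_{\mu-1}} \sum_{j=1}^{s} \binom{s}{j} \alpha_0^{j} \gamma^{s-j} \\
&= \cdots \\
&= \sum_{\substack{j_0 \ge 1, \ldots, j_{\mu-1} \ge 1 \\ j_0 + \cdots + j_{\mu-1} = s}} \frac{s!}{j_0! \cdots j_{\mu-1}!}\, \alpha_0^{j_0} \cdots \alpha_{\mu-1}^{j_{\mu-1}}.
\end{aligned} \tag{6}$$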


This is a homogeneous polynomial of degree $s$ in $\alpha_0, \ldots, \alpha_{\mu-1}$.

Then $\dim \mathscr{C}$ is the number of $s$ such that $F_s(\alpha_0, \ldots, \alpha_{\mu-1})$ is not identically
zero, when the $\alpha_i$ are linearly independent.

In fact we will just count those $F_s(\alpha_0, \ldots, \alpha_{\mu-1})$ which contain a coefficient
which is nonzero modulo 2. We note that such an $F_s$

(i) cannot be identically zero, and

(ii) cannot have $\alpha_0, \ldots, \alpha_{\mu-1}$ linearly dependent (Problem 8).

From Lucas' theorem, a multinomial coefficient

$$\frac{s!}{j_0!\, j_1! \cdots j_{\mu-1}!}$$

is nonzero modulo 2 iff
$$(j_0)_i + (j_1)_i + \cdots + (j_{\mu-1})_i \le (s)_i \quad \text{for all } i,\ 0 \le i \le m-1,$$

where $(x)_i$ denotes the $i$th bit in the binary expansion of $x$.

Therefore (6) contains a nonzero coefficient whenever the binary expansion
of $s$ contains $\ge \mu$ 1's. For example if $s = 2^{i_0} + 2^{i_1} + \cdots + 2^{i_{\mu-1}}$, (6) contains a
nonzero coefficient corresponding to $j_0 = 2^{i_0}, \ldots, j_{\mu-1} = 2^{i_{\mu-1}}$.

The number of such $s$ in the range $[1, 2^m - 1]$ is

$$\binom{m}{\mu} + \binom{m}{\mu+1} + \cdots + \binom{m}{m} = 1 + \binom{m}{1} + \cdots + \binom{m}{r}. \qquad \text{Q.E.D.}$$


Problem. (8) Show that if $\alpha_0, \ldots, \alpha_{\mu-1}$ are linearly dependent, then
$F_s(\alpha_0, \ldots, \alpha_{\mu-1})$ is identically zero modulo 2.

Important Remark. For nonnegative integers $s$ let $w_2(s)$ denote the number of
1's in the binary expansion of $s$. Then the proof of this theorem has shown
that $\alpha^s$ is a nonzero of $\mathscr{R}(r, m)^*$ iff $1 \le s \le 2^m - 1$ and $w_2(s) \ge \mu$. Or in other
words,


Theorem 11. The punctured RM code $\mathscr{R}(r, m)^*$ is a cyclic code, which has as
zeros $\alpha^s$ for all $s$ satisfying

$$1 \le w_2(s) \le m - r - 1 \quad \text{and} \quad 1 \le s \le 2^m - 2.$$


The generator and check polynomials for the punctured RM code $\mathscr{R}(r, m)^*$
are, for $0 \le r \le m-1$,

$$g(x) = \prod_{\substack{1 \le w_2(s) \le m-r-1 \\ 1 \le s \le 2^m-2}} M^{(s)}(x), \tag{7}$$
$$h(x) = (x+1) \prod_{\substack{m-r \le w_2(s) \le m-1 \\ 1 \le s \le 2^m-2}} M^{(s)}(x), \tag{8}$$

where $s$ runs through representatives of the cyclotomic cosets, and $M^{(s)}(x)$ is
the minimal polynomial of $\alpha^s$. (Remember that an empty product is 1.)
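
The description of the zeros in Theorem 11 is easy to tabulate. The sketch below lists the exponents $s$ of the zeros, picks one representative from each cyclotomic coset, and lets the reader confirm that the number of nonzeros equals the dimension $k$; the function names are illustrative.

```python
from math import comb

def zero_exponents(r, m):
    """Exponents s, 1 <= s <= 2^m - 2, with 1 <= w_2(s) <= m - r - 1 (Theorem 11)."""
    return [s for s in range(1, 2 ** m - 1)
            if 1 <= bin(s).count("1") <= m - r - 1]

def coset_representatives(exponents, m):
    """Smallest element of each cyclotomic coset {s, 2s, 4s, ...} mod 2^m - 1."""
    n = 2 ** m - 1
    return sorted({min(s * 2 ** i % n for i in range(m)) for s in exponents})

def dimension_from_zeros(r, m):
    """Length minus number of zeros, which should equal 1 + C(m,1) + ... + C(m,r)."""
    return 2 ** m - 1 - len(zero_exponents(r, m))
```

For $m = 4$, $r = 1$ the representatives are 1, 3 and 5, so $g(x) = M^{(1)}(x)M^{(3)}(x)M^{(5)}(x)$ has degree 10 and the code has dimension 5.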

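An alternative form is obtained by using $\alpha^{-1}$ instead of $\alpha$ as the primitive
element. This has the effect of replacing $s$ by $2^m - 1 - s$ and $w_2(s)$ by
$m - w_2(s)$. Then

$$g(x) = \prod_{\substack{r+1 \le w_2(s) \le m-1 \\ 1 \le s \le 2^m-2}} M^{(s)}(x), \tag{9}$$
$$h(x) = (x+1) \prod_{\substack{1 \le w_2(s) \le r \\ 1 \le s \le 2^m-2}} M^{(s)}(x). \tag{10}$$


If the generator polynomial of $\mathscr{R}(r, m)^*$ is given by Equation (7), then by
Theorem 4 of Ch. 7, the dual code has generator polynomial (10).
The idempotent of $\mathscr{R}(r, m)^*$ is
$$\theta_0 + \sum_{\substack{1 \le w_2(s) \le r \\ 1 \le s \le 2^m-2}} \theta_s^*$$

or equivalently (again replacing $\alpha$ by $\alpha^{-1}$),

$$\theta_0 + \sum_{\substack{m-r \le w_2(s) \le m-1 \\ 1 \le s \le 2^m-2}} \theta_s,$$

where $s$ runs through the representatives of the cyclotomic cosets, and $\theta_s$ is
defined in Ch. 8.
The dual to $\mathscr{R}(r, m)^*$, by Theorem 5 of Ch. 8, has the idempotent

$$\Bigl( 1 + \theta_0 + \sum_{\substack{m-r \le w_2(s) \le m-1 \\ 1 \le s \le 2^m-2}} \theta_s \Bigr)^* = \sum_{\substack{1 \le w_2(t) \le m-r-1 \\ 1 \le t \le 2^m-2}} \theta_t^*.$$

For example, the idempotents of $\mathscr{R}(1, m)^*$ and $\mathscr{R}(2, m)^*$ may be taken to be

$$\theta_0 + \theta_1^*$$
and

$$\theta_0 + \theta_1^* + \sum_{i=[(m+1)/2]}^{m-1} \theta_{l_i}^*$$
respectively, where $l_i = 1 + 2^i$. Then general forms for codewords in various
RM codes of length $2^m$ are as follows. Here the first part of the codeword is
the overall parity check, and the second part is in the cyclic code $\mathscr{R}^*$.

$$\begin{array}{ll}
\mathscr{R}(0, m): & |a_0 \mid a_0\theta_0| \\
\mathscr{R}(1, m): & |a_0 \mid a_0\theta_0 + a_1 x^{i_1}\theta_1^*| \\
\mathscr{R}(2, m): & |a_0 \mid a_0\theta_0 + a_1 x^{i_1}\theta_1^* + \sum_{i=[(m+1)/2]}^{m-1} a_i x^{s_i}\theta_{l_i}^*| \\
\mathscr{R}(m-2, m): & |f(1) \mid f(x)(1 + \theta_1)| \\
\mathscr{R}(m-1, m): & |f(1) \mid f(x)|
\end{array}$$
where $a_0, a_i \in GF(2)$, $i_1, s_i \in \{0, 1, 2, \ldots, 2^m - 2\}$, and $f(x)$ is arbitrary.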

Nesting habits. Figure 13.5 shows the nesting habits of BCH and punctured
RM codes. Here $\mathscr{B}(d)$ denotes the BCH code of designed distance $d$, and the
binary numbers in parentheses are the exponents of the zeros of the codes
(one from each cyclotomic coset). The codes get smaller as we move down
the page.


    R(m-1)* = B(1)                            { }
    R(m-2)* = B(3)                            {1}
              B(5)                            {1, 11}
              B(7)                            {1, 11, 101}

    R(m-3)*   {1, 11, 101, 1001, ...}         B(9)   {1, 11, 101, 111}
                                              B(11)  {1, 11, 101, 111, 1001}
                                              B(13)  {1, 11, 101, 111, 1001, 1011}
                                              B(15)  {1, 11, 101, 111, 1001, 1011, 1101}

    R(m-4)*   {w_2 <= 3}                      B(17)  {1, ..., 1111}

Fig. 13.5. Nesting habits of BCH and punctured RM codes.

We see that


    $\mathscr{R}(r, m)^* \subset$ BCH code of designed distance $2^{m-r} - 1$,
    $\mathscr{R}(r, m) \subset$ extended BCH code of designed distance $2^{m-r} - 1$.


Theorem 12. The incidence vectors of the $(m-r)$-flats in $EG(m, 2)$ generate
$\mathscr{R}(r, m)$.


Proof. We recall that $\mathscr{R}(r, m)$ is $\mathscr{R}(r, m)^*$ with an overall parity check added.
By Theorem 10, the incidence vectors of the $(m-r)$-flats with a 1 in
coordinate 0 generate $\mathscr{R}(r, m)$. So certainly all the $(m-r)$-flats generate
$\mathscr{R}(r, m)$. Q.E.D.


We mention without proof the following generalization of this result. 


Theorem 13. (MacWilliams and Mann.) The rank over $GF(p)$ of the incidence
matrix of the hyperplanes of an $m$-dimensional Euclidean or projective geometry
over $GF(p^s)$ is


$$\binom{m+p-1}{m}^s + \epsilon,$$


where $\epsilon = +1$ for the projective and 0 for the Euclidean geometry.


Research Problem (13.1). Is there a codeword $a(x) \in \mathscr{R}(r, m)^*$ which is the
incidence vector of a $PG(m-r-1, 2)$ and generates $\mathscr{R}(r, m)^*$?


§6. Encoding and decoding (I)


There are two obvious ways to encode an RM code. The first uses the 
generator matrix in the form illustrated in Fig. 13.2 (this is a nonsystematic 
encoder). The second, which is systematic, makes use of the fact (proved in 
Theorem 11) that RM codes are extended cyclic codes. In this § we give a 
decoding algorithm which applies specifically to the first encoder, and then in 
§7 we give a more general decoding algorithm which applies to any encoder.

We illustrate the first decoder by studying the [16, 11, 4] second-order RM 
code of length 16, R(2, 4). As generator matrix G we take the first 11 rows of 
Fig. 13.2. Thus the message symbols 


$$a = a_0 a_4 a_3 a_2 a_1 a_{34} a_{24} a_{14} a_{23} a_{13} a_{12}$$
are encoded into the codeword


$$x = aG = a_0 \mathbf{1} + a_4 \mathbf{v}_4 + \cdots + a_1 \mathbf{v}_1 + \cdots + a_{12} \mathbf{v}_1\mathbf{v}_2 \tag{11}$$
$$\phantom{x} = x_0 x_1 \cdots x_{15} \quad \text{(say)}.$$
This is a single-error-correcting code, and we shall show how to correct one
error by majority logic decoding. (Unlike the majority logic decoder for RS
codes given in Ch. 10, this is a practical scheme.) The first step is to recover
the 6 symbols $a_{12}, \ldots, a_{34}$. Observe from Fig. 13.2 that if there are no errors,

$$\begin{aligned}
a_{12} &= x_0 + x_1 + x_2 + x_3 \\
&= x_4 + x_5 + x_6 + x_7 \\
&= x_8 + x_9 + x_{10} + x_{11} \\
&= x_{12} + x_{13} + x_{14} + x_{15},
\end{aligned} \tag{12}$$

$$\begin{aligned}
a_{13} &= x_0 + x_1 + x_4 + x_5 \\
&= x_2 + x_3 + x_6 + x_7 \\
&= x_8 + x_9 + x_{12} + x_{13} \\
&= x_{10} + x_{11} + x_{14} + x_{15},
\end{aligned} \tag{13}$$

$$\begin{aligned}
a_{34} &= x_0 + x_4 + x_8 + x_{12} \\
&= x_1 + x_5 + x_9 + x_{13} \\
&= x_2 + x_6 + x_{10} + x_{14} \\
&= x_3 + x_7 + x_{11} + x_{15}.
\end{aligned}$$
Equation (12) gives 4 votes for the value of $a_{12}$, Equation (13) gives 4 votes
for $a_{13}$, and so on. So if one error occurs, the majority vote is still correct, and
thus each $a_{ij}$ is obtained correctly.
To find the symbols $a_1, \ldots, a_4$, subtract

$$a_{34}\mathbf{v}_3\mathbf{v}_4 + \cdots + a_{12}\mathbf{v}_1\mathbf{v}_2$$

from $x$, giving say $x' = x_0' x_1' \cdots x_{15}'$. Again from Fig. 13.2 we observe that
$$\begin{aligned}
a_1 &= x_0' + x_1' \\
&= x_2' + x_3' \\
&= \cdots \\
&= x_{14}' + x_{15}', \\
a_2 &= x_0' + x_2' \\
&= \cdots
\end{aligned}$$

Now it is easier: there are 8 votes for each $a_i$, and so if there is one error the
majority vote certainly gives each $a_i$ correctly. It remains to determine $a_0$. We
have

$$x'' = x' - a_4\mathbf{v}_4 - \cdots - a_1\mathbf{v}_1 = a_0\mathbf{1} + \text{error},$$

and $a_0 = 0$ or 1 according to the number of 1's in $x''$.

This scheme is called the Reed decoding algorithm, and will clearly work 
for any RM code. 
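
A compact version of this scheme for general $r$ and $m$ is sketched below. It assumes the nonsystematic encoder based on the monomial generator matrix of Fig. 13.2 with the natural ordering of coordinates (coordinate $j$ carries $v_i$ equal to bit $i-1$ of $j$), represents a message as a dictionary from index tuples such as `(1, 2)` to bits, and resolves tied votes as 0. The function names are illustrative.

```python
from itertools import combinations

def monomial(subset, m):
    """Vector of the product of the v_i, i in subset (the all-ones vector if subset is empty)."""
    return [int(all(j >> (i - 1) & 1 for i in subset)) for j in range(1 << m)]

def rm_encode(coeffs, m):
    """Codeword sum of a_sigma * (monomial sigma) over GF(2)."""
    x = [0] * (1 << m)
    for sigma, a in coeffs.items():
        if a:
            x = [xj ^ vj for xj, vj in zip(x, monomial(sigma, m))]
    return x

def reed_decode(y, r, m):
    """Majority-logic decoding: recover every coefficient a_sigma, highest degree first."""
    y, coeffs = list(y), {}
    for d in range(r, -1, -1):
        found = {}
        for sigma in combinations(range(1, m + 1), d):
            others = [i for i in range(1, m + 1) if i not in sigma]
            votes = 0
            for fixed in range(1 << len(others)):     # one vote per choice of the other variables
                base = sum((fixed >> k & 1) << (i - 1) for k, i in enumerate(others))
                s = 0
                for free in range(1 << d):            # sum y over all values of the variables in sigma
                    s ^= y[base + sum((free >> k & 1) << (i - 1) for k, i in enumerate(sigma))]
                votes += s
            found[sigma] = int(2 * votes > 1 << len(others))
        for sigma, a in found.items():                # strip the degree-d part before going on
            if a:
                y = [yj ^ vj for yj, vj in zip(y, monomial(sigma, m))]
        coeffs.update(found)
    return coeffs
```

With `r = 2`, `m = 4` the four votes computed for `sigma = (1, 2)` are exactly the four sums in (12), and the last pass (`d = 0`) counts the 1's in $x''$.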

How do we find which components of the codeword $x$ are to be used in the
parity checks (12), (13), ...? To answer this we shall give a geometric
description of the algorithm for decoding $\mathscr{R}(r, m)$. We first find $a_\sigma$, where
$\sigma = \sigma_1 \cdots \sigma_r$, say. The corresponding row of the generator matrix, $\mathbf{v}_{\sigma_1} \cdots \mathbf{v}_{\sigma_r}$,
is the incidence vector of an $(m-r)$-dimensional subspace $S$ of $EG(m, 2)$. For
example, the double line in Fig. 13.6 shows the plane S corresponding to awn. 
Let $T$ be the "complementary" subspace to $S$ with incidence vector
$\mathbf{v}_{\tau_1} \cdots \mathbf{v}_{\tau_{m-r}}$, where $\{\tau_1, \ldots, \tau_{m-r}\}$ is the complement of $\{\sigma_1, \ldots, \sigma_r\}$ in
$\{1, 2, \ldots, m\}$. Clearly $T$ meets $S$ in a single point, the origin.
Let $U_1, \ldots, U_{2^{m-r}}$ be all the translates of $T$ in $EG(m, 2)$, including $T$ itself.
(These are shaded in Fig. 13.6.) Each $U_i$ meets $S$ in exactly one point.


Theorem 14. If there are no errors, $a_\sigma$ is given by

$$a_\sigma = \sum_{P \in U_i} x_P, \qquad i = 1, \ldots, 2^{m-r}.$$

These equations are a generalization of Equations (12), (13), and give $2^{m-r}$ votes for $a_\sigma$.


Fig. 13.6. EG(4, 2) showing the subspace $S$ (double lines) and $U_1, \ldots, U_4$ (shaded).


Proof. Because of the form of the generator matrix, the codeword $x$ is

$$x = \sum_{\rho = \rho_1 \cdots \rho_s} a_\rho \, v_{\rho_1} \cdots v_{\rho_s},$$

where the sum is over all subsets $\{\rho_1, \ldots, \rho_s\}$ of $\{1, \ldots, m\}$ of size at most $r$. (This generalizes Equation (11).) Therefore

$$\sum_{P \in U_i} x_P = \sum_\rho a_\rho \sum_{P \in U_i} (v_{\rho_1} \cdots v_{\rho_s})_P = \sum_\rho a_\rho N(U_i, \rho),$$

where $N(U_i, \rho)$ is the number of points in the intersection of $U_i$ and the subspace $W$ with incidence vector $v_{\rho_1} \cdots v_{\rho_s}$.

We use the fact that the intersection of two subspaces is a subspace, and all subspaces (except points) contain an even number of points. Now $T$ and $W$ intersect in the subspace

$$v_{\tau_1} \cdots v_{\tau_{m-r}} v_{\rho_1} \cdots v_{\rho_s}.$$

If $s < r$, this subspace has dimension at least 1, and $N(U_i, \rho)$ is even. On the other hand, if $s = r$ but $W \ne S$, then one of the $\rho_i$ must equal one of the $\tau_j$, say $\rho_1 = \tau_1$. Then $T$ and $W$ intersect in

$$v_{\tau_1} \cdots v_{\tau_{m-r}} v_{\rho_2} \cdots v_{\rho_s},$$

which again has dimension at least 1, and $N(U_i, \rho)$ is even. Finally, if $W = S$, $N(U_i, \rho) = 1$. Q.E.D.


This theorem implies that, if no more than $[\tfrac12(2^{m-r} - 1)]$ errors occur, majority logic decoding will recover each of the symbols $a_\sigma$ correctly, where $\sigma$ is any string of $r$ symbols. The rest of the $a$'s can be recovered in the same way, as shown in the previous example. Thus the Reed decoding algorithm can correct $[\tfrac12(d - 1)] = [\tfrac12(2^{m-r} - 1)]$ errors.
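The Reed algorithm for the [16, 11, 4] code of the example can be sketched in a few lines. This is only an illustration under stated assumptions: points of EG(4, 2) are the integers 0..15, $v_i$ is taken to be bit $i - 1$ of the point (so that the votes for $a_{12}$ are exactly the four sums in Equation (12)), index sets are 0-based, and all function names are hypothetical.

```python
from itertools import combinations

M, R = 4, 2          # parameters of R(2, 4), the [16, 11, 4] code
N = 1 << M           # code length 16

def monomials(max_deg):
    """All 0-based index sets sigma with |sigma| <= max_deg."""
    return [s for d in range(max_deg + 1) for s in combinations(range(M), d)]

def monomial_value(sigma, p):
    """Value at point p of the product of the v's indexed by sigma (1 if sigma is empty)."""
    return int(all((p >> i) & 1 for i in sigma))

def encode(coeffs):
    """coeffs maps each index set sigma to the message bit a_sigma."""
    return [sum(a * monomial_value(s, p) for s, a in coeffs.items()) % 2
            for p in range(N)]

def votes(word, sigma):
    """One vote per translate U_i: the mod-2 sum of word over the points of U_i."""
    mask = sum(1 << i for i in sigma)
    sums = {}
    for p in range(N):
        base = p & ~mask          # the coordinates outside sigma label the translate
        sums[base] = sums.get(base, 0) ^ word[p]
    return list(sums.values())

def reed_decode(received):
    """Recover all a_sigma, highest degree first, by majority vote."""
    word, coeffs = list(received), {}
    for deg in range(R, -1, -1):
        level = {}
        for s in combinations(range(M), deg):
            v = votes(word, s)
            level[s] = int(2 * sum(v) > len(v))   # ties go to 0
        # Subtract the terms just found before moving to the next lower degree.
        for p in range(N):
            for s, a in level.items():
                word[p] ^= a & monomial_value(s, p)
        coeffs.update(level)
    return coeffs
```

For the index set `(0, 1)` the function `votes` groups the points as {0,1,2,3}, {4,5,6,7}, {8,...,11}, {12,...,15}, which is Equation (12); for the empty index set every coordinate is its own translate, which is the final count of 1's used to find $a_0$.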


§7. Encoding and decoding (II)


The Reed decoding algorithm does not apply if the code is encoded systematically as an extended cyclic code, as in §8 of Ch. 7. Fortunately another majority logic decoding algorithm is available, and its description will lead us to a more general class of codes, the finite geometry codes.

If $\mathcal{C}$ is any $[n, k]$ code over GF$(q)$, the rows of the $H$ matrix are parity checks, i.e. define equations

$$\sum_{i=0}^{n-1} h_i x_i = 0$$

which every codeword $x$ must satisfy. Of course any linear combination of the rows of $H$ is also a parity check: so in all there are $q^{n-k}$ parity checks. The art of majority logic decoding is to choose the best subset of these equations.


Definition. A set of parity check equations is called orthogonal on the $i$-th coordinate if $x_i$ appears in each equation, but no other $x_j$ appears more than once in the set.


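Example. Consider the [7, 3, 4] simplex code, with parity check matrix

$$
H = \begin{bmatrix}
1 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.
$$

Seven of the 16 parity checks are shown in Fig. 13.7.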


0123456 
1101000 
0110100 
0011010 
0001101 
1000110 
0100011 
1010001 


Fig. 13.7. Parity checks on the [7,3, 4] code. 


Rows 1, 5, 7 of Fig. 13.7 are the parity checks

$$
\begin{aligned}
x_0 + x_1 + x_3 &= 0, \\
x_0 + x_4 + x_5 &= 0, \\
x_0 + x_2 + x_6 &= 0,
\end{aligned}
$$

which are orthogonal on coordinate 0. Of course these correspond to the lines through the point 0 in Fig. 2.12. Similarly the lines through 1 give three parity checks orthogonal on coordinate 1, and so on.

Suppose now that an error vector $e$ occurs and $y = x + e$ is received. If there are $J$ parity checks orthogonal on coordinate 0, we define $S_1, \ldots, S_J$ to be the result of applying these parity checks to $y$. In the above example we have

$$
\begin{aligned}
S_1 &= y_0 + y_1 + y_3 = e_0 + e_1 + e_3, \\
S_2 &= y_0 + y_4 + y_5 = e_0 + e_4 + e_5, \\
S_3 &= y_0 + y_2 + y_6 = e_0 + e_2 + e_6.
\end{aligned}
$$


Theorem 15. If not more than $[\tfrac12 J]$ errors occur, then the true value of $e_0$ is the value taken by the majority of the $S_i$'s, with the rule that ties are broken in favor of 0.


Proof. Suppose at most $\tfrac12 J$ errors occur. (i) If $e_0 = 0$, then at most $[\tfrac12 J]$ equations are affected by the errors. Therefore at least $\lceil \tfrac12 J \rceil$ of the $S_i$'s are equal to 0. (ii) If $e_0 = 1$, then fewer than $[\tfrac12 J]$ equations are affected by the other errors. Hence the majority of the $S_i$'s are equal to 1. Q.E.D.


Corollary 16. If there are $J$ parity checks orthogonal on every coordinate, the code can correct $[\tfrac12 J]$ errors.
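As a small illustration of Theorem 15 and Corollary 16, the sketch below decodes the [7, 3, 4] simplex code of the example with $J = 3$ checks per coordinate. It assumes that the checks on coordinate $i$ are obtained by cyclically shifting rows 1, 5, 7 of Fig. 13.7 (the code is cyclic); the function names are hypothetical.

```python
N = 7
# Rows 1, 5, 7 of Fig. 13.7, written as sets of coordinates: orthogonal on coordinate 0.
CHECKS_ON_0 = [(0, 1, 3), (0, 4, 5), (0, 2, 6)]

def estimate_error_bit(y, coord):
    """Majority vote of the J = 3 check sums orthogonal on `coord`."""
    sums = [sum(y[(j + coord) % N] for j in chk) % 2 for chk in CHECKS_ON_0]
    return int(2 * sum(sums) > len(sums))     # a tie would be broken in favor of 0

def decode(y):
    """One-step majority logic decoding: estimate every e_i, then subtract."""
    e = [estimate_error_bit(y, i) for i in range(N)]
    return [a ^ b for a, b in zip(y, e)]
```

With $J = 3$ the decoder corrects $[\tfrac12 J] = 1$ error, which for this code is also $[\tfrac12(d-1)]$.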


Remarks. (i) If the code is cyclic, once a set of J parity checks orthogonal on 
one coordinate has been found, J parity checks orthogonal on the other 
coordinates are obtained by cyclically shifting the first set. 

(ii) The proof of Theorem 15 shows that some error vectors of weight greater than $[\tfrac12 J]$ will cause incorrect decoding. However, one of the nice features of majority logic decoding (besides the inexpensive circuitry) is that often many error vectors of weight greater than $[\tfrac12 J]$ are also corrected.

(iii) Breaking ties. In case of a tie, the rule is to favor 0 if it is one of the alternatives, but otherwise to break ties in any way. Equivalently, use the majority of $\{0, S_1, S_2, \ldots\}$.

This method of decoding is called one-step majority logic decoding. 
However, usually there are not enough orthogonal parity checks to correct up 
to one-half of the minimum distance, as the following theorem shows. 


Theorem 17. For a code over GF$(q)$, the number of errors which can be corrected by one-step majority logic decoding is at most

$$\frac{n - 1}{2(d' - 1)},$$

where $d'$ is the minimum distance of the dual code.


Proof. The parity checks orthogonal on the first coordinate each have a nonzero entry in the first coordinate and at least $d' - 1$ further nonzero entries, in positions used by no other check of the set, since each corresponds to a codeword in the dual code. Therefore $J \le (n - 1)/(d' - 1)$. By Remark (ii) above, some error pattern of weight $[\tfrac12 (n - 1)/(d' - 1)] + 1$ will cause the decoder to make a mistake. Q.E.D.


Examples. (1) For the [23, 12, 7] Golay code, $d' = 8$, and so at most $[22/(2 \cdot 7)] = 1$ error can be corrected by one-step decoding.


(2) Likewise most RS codes cannot be decoded by one-step decoding, since $d' = n - d + 2$.


However, there are codes for which one-step majority logic decoding is useful, such as the [7, 3, 4] code of Fig. 13.7 and more generally the difference-set cyclic codes described below.


L-step decoding. Some codes, for example RM codes, can be decoded using 
several stages of majority logic. 


Definition. A set of parity checks $S_1, S_2, \ldots$ is called orthogonal on coordinates $a, b, \ldots, c$ if the sum $x_a + x_b + \cdots + x_c$ appears in each $S_i$, but no other $x_i$ appears more than once in the set.


Example. 2-step decoding of the [7, 4, 3] code. The 7 nonzero parity checks are

0123456
1110100
0111010
0011101
1001110
0100111
1010011
1101001

There are two parity checks orthogonal on coordinates 0 and 1, namely

$$
\begin{aligned}
S_1 &= e_0 + e_1 + e_2 + e_4, \\
S_2 &= e_0 + e_1 + e_3 + e_6,
\end{aligned}
$$

two which are orthogonal on coordinates 0 and 2,

$$
\begin{aligned}
S_1 &= e_0 + e_1 + e_2 + e_4, \\
S_3 &= e_0 + e_2 + e_5 + e_6,
\end{aligned}
$$

and so on. Suppose there is one error. Then the majority rule gives the correct value of $e_0 + e_1$ (from the first pair of equations), and of $e_0 + e_2$ (from the second pair). Now the equations

$$
\begin{aligned}
S_4 &= e_0 + e_1, \\
S_5 &= e_0 + e_2
\end{aligned}
$$

are orthogonal on $e_0$, and again the majority rule gives $e_0$. This is a two-step majority logic decoding algorithm to find $e_0$. A circuit for doing this is shown in Fig. 13.8.









Fig. 13.8. Two-step majority decoding of the [7, 4, 3] code.


Since the code is cyclic, it is enough to design a decoder which corrects the 
first coordinate. The others are then corrected automatically. 
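The two-step scheme of this example can be sketched in software as follows. The sketch assumes the check sums $S_1, S_2, S_3$ given above (coordinate sets {0,1,2,4}, {0,1,3,6}, {0,2,5,6}), uses cyclic shifts to treat the other coordinates, and breaks ties in favor of 0 as in Remark (iii); the names are hypothetical and the code models the logic only, not the circuit of Fig. 13.8.

```python
N = 7
PAIR_01 = [(0, 1, 2, 4), (0, 1, 3, 6)]   # S1, S2: orthogonal on coordinates 0 and 1
PAIR_02 = [(0, 1, 2, 4), (0, 2, 5, 6)]   # S1, S3: orthogonal on coordinates 0 and 2

def majority(votes):
    """Majority value of a list of bits; ties are broken in favor of 0."""
    return int(2 * sum(votes) > len(votes))

def check_sum(y, positions, shift):
    """Apply a parity check, cyclically shifted by `shift`, to the received word."""
    return sum(y[(p + shift) % N] for p in positions) % 2

def estimate_error_bit(y, i):
    s4 = majority([check_sum(y, c, i) for c in PAIR_01])   # estimate of e_i + e_{i+1}
    s5 = majority([check_sum(y, c, i) for c in PAIR_02])   # estimate of e_i + e_{i+2}
    return majority([s4, s5])                              # second step: estimate of e_i

def decode(y):
    return [bit ^ estimate_error_bit(y, i) for i, bit in enumerate(y)]
```

Note that each majority gate here has only two inputs, so the tie rule does real work: a single error outside the coordinates being estimated produces a 1-1 split, which is resolved to 0.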

A decoder which has L levels of majority logic is called an L-step decoder. 
The basic idea, as illustrated in the preceding example, is that the number of 
coordinates in the check sums which are being estimated decreases from level 
to level, until at the final step we have estimates for the individual coor- 
dinates. 


Lemma 18. If there are $J$ checks at each stage of the decoding, then $[\tfrac12 J]$ errors can be corrected.


Proof. As for Theorem 15. Q.E.D. 


Even with L-step decoding it may not be possible to correct up to half the 
minimum distance. 


Theorem 19. For a code over GF$(q)$, the number of errors which can be corrected by L-step majority decoding is at most

$$\frac{n}{d'} - \frac12,$$

where $d'$ is the minimum distance of the dual code.


Proof. Suppose there is a set of $J$ parity checks orthogonal on $l$ coordinates. We shall show that $J \le 2n/d' - 1$, which by Remark (ii) above proves the theorem. Let the $i$-th parity check involve $a_i$ coordinates besides the $l$. Since these checks correspond to codewords in the dual code, we have

$$l + a_i \ge d', \qquad a_i + a_j \ge d' \quad (i \ne j). \tag{14}$$

Set $S = \sum_{i=1}^{J} a_i$. Then $S \le n - l$, and from (14),

$$Jl + S \ge Jd', \qquad (J - 1)S \ge \binom{J}{2} d'.$$

Eliminating $l$ gives $n - S \ge (Jd' - S)/J$, and eliminating $S$ then gives $(J - 1)d' \le 2n - 2d'$. Q.E.D.


Example. For the [23, 12, 7] Golay code, L-step decoding cannot correct more 
than 2 errors. 
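The bounds of Theorems 17 and 19 are easy to evaluate numerically. The following sketch (with hypothetical function names) reproduces the two Golay code figures quoted in the examples, using exact rational arithmetic.

```python
from fractions import Fraction
from math import floor

def one_step_bound(n, d_dual):
    """Theorem 17: at most (n - 1) / (2 (d' - 1)) errors, rounded down."""
    return floor(Fraction(n - 1, 2 * (d_dual - 1)))

def l_step_bound(n, d_dual):
    """Theorem 19: at most n / d' - 1/2 errors, rounded down."""
    return floor(Fraction(n, d_dual) - Fraction(1, 2))

# [23, 12, 7] Golay code, whose dual has minimum distance d' = 8.
golay_one_step = one_step_bound(23, 8)
golay_l_step = l_step_bound(23, 8)
```

Since the Golay code has $[\tfrac12(d-1)] = 3$, neither form of majority logic decoding reaches half the minimum distance for it.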


In contrast to this, we have: 


Theorem 20. For the $r$-th order RM code $\mathcal{R}(r, m)$, $(r + 1)$-step majority decoding can correct $[\tfrac12(d - 1)] = [\tfrac12(2^{m-r} - 1)]$ errors.


Proof. The dual code is $\mathcal{R}(m - r - 1, m)$ and by Theorem 8 the low weight codewords in the dual code are the incidence vectors of the $(r + 1)$-dimensional flats in EG$(m, 2)$.

Let $V$ be any $r$-flat. We will find a set of parity checks orthogonal on the coordinates $y_P$, $P \in V$. In fact, let $U$ be any $(r + 1)$-flat containing $V$. Now each of the $2^m - 2^r$ points not in $V$ determines a $U$, and each $U$ is determined by $2^{r+1} - 2^r$ such points. Therefore there are $(2^m - 2^r)/(2^{r+1} - 2^r) = 2^{m-r} - 1$ different $U$'s. Any two such $U$'s meet only in $V$. Thus we have an estimate for the sum

$$\sum_{P \in V} y_P.$$

This estimate will be correct provided no more than $[\tfrac12(2^{m-r} - 1)]$ errors occur. We repeat this for all $r$-flats $V$.

Next, let $W$ be any $(r - 1)$-flat, and let $V$ be any $r$-flat containing $W$. There are $2^{m-r+1} - 1$ such $V$'s and from the first stage we know the values of the corresponding sums. Therefore we can obtain an estimate for the value of the sum

$$\sum_{P \in W} y_P.$$

Proceeding in this way, after $r + 1$ steps we finally arrive at an estimate for $y_P$, for any point $P$, which will be correct provided no more than $[\tfrac12(d - 1)] = [\tfrac12(2^{m-r} - 1)]$ errors occur. Q.E.D.


Improvements of the decoding algorithm. A more practical scheme than the preceding is to use the cyclic code $\mathcal{R}(r, m)^*$, since then one need only construct a circuit to decode the first coordinate. The dual code $(\mathcal{R}(r, m)^*)^{\perp}$ now contains the incidence vectors of all $(r + 1)$-flats in EG$(m, 2)$ which do not pass through the origin.

This is illustrated by the [7, 4, 3] code, for which we gave a two-step decoding algorithm in Fig. 13.8. This code is in fact $\mathcal{R}(1, 3)^*$.

The following technique, known as sequential code reduction, considerably 
reduces the number of majority gates, but at the cost of increasing the delay 
in decoding. 

We shall illustrate the technique by applying it to the decoder of Fig. 13.8. The idea is very simple. In Fig. 13.8, let

$$S_4, \; S_4', \; S_4'', \; \ldots$$

denote the output from the first majority gate at successive times. Then

$$S_4 = e_0 + e_1, \qquad S_4' = e_1 + e_2, \qquad \ldots$$

Note that

$$S_4 + S_4' = S_5.$$

So if we are willing to wait one clock cycle, $S_5$ can be obtained without using the second majority gate. The resulting circuit is shown in Fig. 13.9.

In general this technique (when it is applicable) will reduce the number of 
majority gates, adders, etc., from an exponential function of the number of 
steps to a linear function, at the cost of a linear delay (as in Fig. 13.10). 

The name "sequential code reduction" comes from the fact that at each successive stage of the decoder we have estimates for additional parity checks, and so the codeword appears to belong to a smaller code. In the example, after the first stage we know all the sums $e_i + e_j$, which are in fact parity checks on the [7, 1, 7] repetition code.

Unfortunately, sequential code reduction doesn't apply to all codes (and even when it does apply, it may be a difficult task to find the best decoder). References to this and to other modifications and generalizations of the basic algorithm are given in the notes.


Fig. 13.9. Decoder for [7, 4, 3] code using sequential code reduction.

Fig. 13.10. Circuit using sequential code reduction; circuit for L-step decoding.


Threshold decoding. Let us return briefly to the problem of decoding an arbitrary binary linear code. Suppose the codeword $x$ is transmitted and $y$ is received. The decoder decides that the most likely error vector $\hat e$ is the coset leader of the coset containing $y$, and decodes $y$ as $\hat x = y - \hat e$. We saw in Theorem 5 of Ch. 1 that $\hat e$ is a function of the syndrome $S$. More precisely $\hat e$ is a binary vector-valued function of the $n - k$ components $S_1, S_2, \ldots, S_{n-k}$ of $S$ (see Fig. 13.11). For small codes we could synthesize $\hat e$ by a simple combinational logic circuit using AND's, OR's, etc. (but no delay elements). This can be further simplified if the code is cyclic, in which case the circuit is called a Meggitt decoder. Figure 13.8 is a simple example of a Meggitt decoder. See also Problem 36 of Ch. 7.


Fig. 13.11. Decoder block diagram (blocks labeled BUFFER, COMPUTE SYNDROME and COMPUTE $\hat e$; output $\hat x$).

But larger codes (e.g. BCH codes, Fig. 9.5) require more complicated components to synthesize $\hat e$. For example in this chapter we have used majority gates to decode RM codes. A more general notion is that of a weighted majority gate, or threshold gate, which is a function $\theta(v_1, \ldots, v_m)$ of $v_1, \ldots, v_m$, with real weights $a_1, \ldots, a_m$, defined by

$$
\theta(v_1, \ldots, v_m) =
\begin{cases}
0 & \text{if } \sum_{i=1}^{m} a_i (-1)^{v_i} \ge 0, \\
1 & \text{if } \sum_{i=1}^{m} a_i (-1)^{v_i} < 0,
\end{cases}
\tag{15}
$$

where these sums are real.

In contrast to majority logic decoding, any code can be decoded by a one-step threshold gate decoding algorithm. It is easy to show that this can always be done: what is hard is to find an efficient way of doing it.


Theorem 21. (Rudolph.) Any binary linear code can be decoded by one-step 
threshold gate decoding. 


Proof. Write $\hat e = (f_1, \ldots, f_n)$, where each component $f_i = f_i(S) = f_i(S_1, \ldots, S_{n-k})$ is a function of the syndrome. Let $F_i(S) = 1 - 2f_i(S)$ be the corresponding real $\pm 1$-valued function. By Equation (11) of Ch. 14, $F_i(S)$ can be written

$$F_i(S) = \frac{1}{2^{n-k}} \sum_{u \in V^{n-k}} \hat F_i(u) (-1)^{u \cdot S},$$

where the $\hat F_i(u)$ are the Hadamard coefficients of $F_i(S)$ given by Equation (8) of Ch. 14. Then

$$f_i(S) = \frac12 \left( 1 - \frac{1}{2^{n-k}} \sum_{u \in V^{n-k}} \hat F_i(u) (-1)^{u \cdot S} \right).$$

If $\theta$ is any threshold gate function of the $2^{n-k}$ inputs $\sum_{j=1}^{n-k} u_j S_j$, $u \in V^{n-k}$, with weights $a_u$, it is immediate from the definition of $\theta$ that

$$\theta\Big( \sum_{j} u_j S_j : u \in V^{n-k} \Big) = \frac12 \Big( 1 - \operatorname{sgn} \sum_{u \in V^{n-k}} a_u (-1)^{\sum_j u_j S_j} \Big),$$

where $\operatorname{sgn}(x) = 1$ if $x \ge 0$, $= -1$ if $x < 0$. Therefore

$$f_i(S) = \theta\Big( \sum_{j} u_j S_j : u \in V^{n-k} \Big) \tag{16}$$

if we take $a_u = \hat F_i(u)/2^{n-k}$. Since we can do this for each $i$, the theorem follows. Q.E.D.

Unfortunately Equation (16) represents $\hat e$ as a threshold function of all the $2^{n-k}$ parity checks, so this is not a practical algorithm. However in a number of cases it has been possible to find a different one-step threshold gate realization of $\hat e$ which involves many fewer parity checks (see Notes).
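The construction in the proof of Theorem 21 can be checked numerically for small syndromes. The sketch below is illustrative only: it takes an arbitrary Boolean function `f` of `k` syndrome bits (standing in for one component $f_i$), computes its Hadamard coefficients as $\hat F(u) = \sum_S (-1)^{u \cdot S} F(S)$, which is assumed here to be the convention of Equation (8) of Ch. 14, and builds the threshold gate with weights $a_u = \hat F(u)/2^k$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def dot(u, s):
    """Inner product of two binary tuples, mod 2 (the parity check u applied to s)."""
    return sum(a * b for a, b in zip(u, s)) % 2

def hadamard_coefficients(f, k):
    """Hadamard coefficients of F = 1 - 2f, one for each u in V^k."""
    pts = list(product((0, 1), repeat=k))
    return {u: sum((1 - 2 * f(s)) * (-1) ** dot(u, s) for s in pts) for u in pts}

def threshold_realization(f, k):
    """Threshold gate on all 2^k parity checks u.S with weights F^(u) / 2^k."""
    weights = {u: Fraction(c, 2 ** k) for u, c in hadamard_coefficients(f, k).items()}
    def theta(s):
        total = sum(a * (-1) ** dot(u, s) for u, a in weights.items())
        return 0 if total >= 0 else 1
    return theta
```

For example, `threshold_realization(lambda s: int(s == (1, 0, 1)), 3)` returns a function that agrees with its argument `f` on all 8 syndromes. The weighted sum inside `theta` always equals $\pm 1$, which is why the sign test recovers `f` exactly, and also why all $2^k$ inputs are needed in this realization.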


Research Problem (13.2). Find the most efficient one-step threshold gate 
realization of a given Boolean function. 


§8. Other geometrical codes


(I) Difference-set cyclic codes. Let $\Pi$ be the projective plane PG$(2, p^s)$ of order $p^s$ (see Appendix on Finite Geometries). $\Pi$ contains $n = p^{2s} + p^s + 1$ points, which can be represented as triples

$$(\beta_1, \beta_2, \beta_3), \qquad \beta_i \in \mathrm{GF}(p^s).$$

Note that $(\lambda\beta_1, \lambda\beta_2, \lambda\beta_3)$, $\lambda \in \mathrm{GF}(p^s)^*$, is the same point as $(\beta_1, \beta_2, \beta_3)$. Each triple can be regarded as an element of GF$(p^{3s})$, i.e. can be written as a power of $\alpha$, where $\alpha$ is a primitive element of GF$(p^{3s})$. Some scalar multiple of each triple is equal to $\alpha^i$ for $0 \le i < n$. We label the $n$ points of the plane by these powers of $\alpha$.

Let $\alpha^{i_1}, \ldots, \alpha^{i_l}$, $l = p^s + 1$, be a line of $\Pi$. The incidence vector of this line has 1's in exactly the coordinates $i_1, \ldots, i_l$. By the proof of Theorem 10, any cyclic shift of this vector is the incidence vector of another line. Since there are $n$ shifts and $n$ lines, every line of $\Pi$ is obtained in this way.

Let $\mathcal{D}$ be the code generated over GF$(p)$ by these $n$ incidence vectors, and let $\mathcal{C} = \mathcal{D}^{\perp}$. Clearly $\mathcal{D}$ is a cyclic code of length $n$. From Theorem 13, $\mathcal{D}$ has dimension $\binom{p+1}{2}^s + 1$.

$\mathcal{C}$ can be decoded by one-step majority logic, as follows. The incidence vectors of the $l = p^s + 1$ lines through a point of $\Pi$ form a set of orthogonal checks on that coordinate. (They are orthogonal because two lines through a point have no other intersection.) By Corollary 16, one-step majority logic decoding will correct $\tfrac12(p^s + 1)$ errors, and the code has minimum distance at least $p^s + 2$.


Examples. (1) The simplest example is when $p = 2$, $s = 1$. Then $\mathcal{C}$ is the [7, 3, 4] binary simplex code again, as shown in Fig. 13.7.

(2) If $p = s = 2$, $\mathcal{D}$ is generated by the lines of the projective plane of order 4, and $\mathcal{C}$ is a [21, 11, 6] binary code.

These codes are closely related to difference sets. 


Definition. A planar difference set modulo $n = l(l - 1) + 1$ is a set of $l$ numbers $d_1, \ldots, d_l$ with the property that the $l(l - 1)$ differences $d_i - d_j$ $(i \ne j)$, when reduced modulo $n$, are exactly the numbers $1, 2, \ldots, n - 1$ in some order.


Examples. (1) For $l = 3$, the numbers $d_1 = 0$, $d_2 = 1$, $d_3 = 3$ form a planar difference set modulo 7. Indeed, the differences modulo 7 are $1 - 0 = 1$, $3 - 0 = 3$, $3 - 1 = 2$, $0 - 1 = 6$, $0 - 3 = 4$, $1 - 3 = 5$.

(2) {0, 1, 3, 9} is a difference set modulo 13, and {0, 2, 7, 8, 11} is a difference set modulo 21.
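The definition is easy to test by machine. The following sketch (hypothetical function names) checks the three examples above and forms the cyclic shifts of a difference set, which play the role of the lines of the plane.

```python
def is_planar_difference_set(d, n):
    """True if the differences d_i - d_j (i != j) mod n are exactly 1, ..., n - 1."""
    diffs = sorted((a - b) % n for a in d for b in d if a != b)
    return diffs == list(range(1, n))

def lines_from_difference_set(d, n):
    """The n cyclic shifts of d, each viewed as a set of points 0, ..., n - 1."""
    return [frozenset((x + shift) % n for x in d) for shift in range(n)]

examples = {7: [0, 1, 3], 13: [0, 1, 3, 9], 21: [0, 2, 7, 8, 11]}
results = {n: is_planar_difference_set(d, n) for n, d in examples.items()}
```

Because every nonzero residue occurs exactly once as a difference, any two distinct shifts share exactly one point, which is the orthogonality property used for one-step majority logic decoding.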

The only known planar difference sets are those obtained from a projective plane $\Pi$ of order $p^s$ in the following way: let $\alpha^{i_1}, \ldots, \alpha^{i_l}$ be the points of a line of $\Pi$. Then $\{i_1, \ldots, i_l\}$ is a planar difference set. For suppose two differences are equal, say

$$i_r - i_s = i_t - i_u, \qquad \text{where } i_r \ne i_t.$$

Then the cyclic shift of this line which sends the point $\alpha^{i_r}$ into the point $\alpha^{i_t}$ gives a new line which meets the first in two points, a contradiction. Thus we have proved


Theorem 22. (Singer.) If $\alpha^{i_1}, \ldots, \alpha^{i_l}$, $l = p^s + 1$, are the points of a line in a projective plane PG$(2, p^s)$, then $\{i_1, \ldots, i_l\}$ is a planar difference set modulo $p^{2s} + p^s + 1$.


Research Problem (13.3). Are there any other planar difference sets? 


(II) Euclidean and projective geometry codes. Instead of taking $\mathcal{D}$ to be generated by the incidence vectors of the lines of a projective plane, we may use the incidence vectors of the $r$-dimensional flats of a Euclidean geometry EG$(m, p^s)$ or a projective geometry PG$(m, p^s)$. Then $\mathcal{C} = \mathcal{D}^{\perp}$ is a Euclidean or projective geometry code. There is no simple formula for the dimension of these codes. Decoding can be done by $r$-step majority logic decoding as for RM codes. Further details and generalizations can be found in the references given in the notes. All of these codes can be regarded as extensions of RM codes.


§9. Automorphism groups of the RM codes


Let $A = (a_{ij})$ be an invertible $m \times m$ binary matrix and let $b$ be a binary $m$-tuple. The transformation

$$
T: \text{ replace } \begin{pmatrix} v_1 \\ \vdots \\ v_m \end{pmatrix} \text{ by } A \begin{pmatrix} v_1 \\ \vdots \\ v_m \end{pmatrix} + b
\tag{17}
$$

is a permutation of the set of $2^m$ $m$-tuples which sends 0 into $b$.

We may also think of $T$ as permuting Boolean functions:

$$
T: \text{ replace } f(v_1, \ldots, v_m) \text{ by } f\Big( \sum_j a_{1j} v_j + b_1, \ldots, \sum_j a_{mj} v_j + b_m \Big).
\tag{18}
$$

The set of all such transformations $T$ forms a group, with composition as the group operation. The order of this group is found as follows. The first column of $A$ may be chosen in $2^m - 1$ ways, the second in $2^m - 2$, the third in $2^m - 4$, .... Furthermore there are $2^m$ choices for $b$. So this group, which is called the general affine group and is denoted by GA$(m)$, has order

$$|GA(m)| = 2^m (2^m - 1)(2^m - 2)(2^m - 2^2) \cdots (2^m - 2^{m-1}). \tag{19}$$

A useful approximation to its order is

$$|GA(m)| \approx 0.29 \cdot 2^{m^2 + m} \quad \text{for } m \text{ large.}$$

(We encountered another form of this group in §5 of Ch. 8.)

It is clear from (18) that if $f$ is a polynomial of degree $r$, so is $Tf$. Therefore the group GA$(m)$ permutes the codewords of the $r$-th order RM code $\mathcal{R}(r, m)$, and

$$GA(m) \subset \operatorname{Aut} \mathcal{R}(r, m). \tag{20}$$

The subgroup of GA$(m)$ consisting of all transformations

$$
T: \text{ replace } \begin{pmatrix} v_1 \\ \vdots \\ v_m \end{pmatrix} \text{ by } A \begin{pmatrix} v_1 \\ \vdots \\ v_m \end{pmatrix}
\tag{21}
$$

(i.e., for which $b = 0$) is the general linear group GL$(m, 2)$ (see §5 of Ch. 8), and has order

$$|GL(m, 2)| = (2^m - 1)(2^m - 2)(2^m - 2^2) \cdots (2^m - 2^{m-1}) \approx 0.29 \cdot 2^{m^2} \quad \text{for } m \text{ large.} \tag{22}$$

Since (21) fixes the zero $m$-tuple, the group GL$(m, 2)$ permutes the codewords of the punctured RM code $\mathcal{R}(r, m)^*$:

$$GL(m, 2) \subset \operatorname{Aut} \mathcal{R}(r, m)^*. \tag{23}$$

Note that GL$(m, 2)$ is doubly transitive and GA$(m)$ is triply transitive (Problem 9).
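The orders (19) and (22) and the constant 0.29 in the approximations can be checked directly. The sketch below uses hypothetical function names; the ratio $|GL(m,2)| / 2^{m^2}$ equals $\prod_{i=1}^{m} (1 - 2^{-i})$, which decreases towards roughly 0.2888 as $m$ grows.

```python
def order_gl(m):
    """|GL(m, 2)| from Equation (22): product of (2^m - 2^i) for i = 0, ..., m - 1."""
    order = 1
    for i in range(m):
        order *= 2 ** m - 2 ** i
    return order

def order_ga(m):
    """|GA(m)| from Equation (19): 2^m choices of b for each invertible A."""
    return 2 ** m * order_gl(m)

# Ratio to the leading power of 2, to compare with the constant 0.29.
ratio_gl_20 = order_gl(20) / 2 ** (20 * 20)
ratio_ga_20 = order_ga(20) / 2 ** (20 * 20 + 20)
```

For small cases the exact values are more informative than the approximation: `order_gl(3)` is 168 and `order_ga(3)` is 1344.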


Theorem 23. For $1 \le r \le m - 1$,
(a) $\operatorname{Aut} \mathcal{R}(r, m)^* \subset \operatorname{Aut} \mathcal{R}(r + 1, m)^*$,
(b) $\operatorname{Aut} \mathcal{R}(r, m) \subset \operatorname{Aut} \mathcal{R}(r + 1, m)$.


Proof of (b). Let $x_1, \ldots, x_B$ be the minimum weight vectors of $\mathcal{R}(r, m)$. For $\pi \in \operatorname{Aut} \mathcal{R}(r, m)$, let $\pi x_i = x_{i'}$. Now $x_i$ is an $(m - r)$-flat. If $Y$ is any $(m - r - 1)$-flat, then for some $i, j$, $Y = x_i * x_j$. Therefore

$$\pi Y = \pi(x_i * x_j) = \pi x_i * \pi x_j = x_{i'} * x_{j'},$$

which is the intersection of two $(m - r)$-flats, and contains $2^{m-r-1}$ points since $\pi$ is a permutation. Thus $\pi Y$ is an $(m - r - 1)$-flat. So $\pi$ permutes the generators of $\mathcal{R}(r + 1, m)$, and therefore preserves the whole code. Part (a) is proved in the same way. Q.E.D.


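It is immediate from Problem 5 that

$$
\begin{aligned}
\operatorname{Aut} \mathcal{R}(r, m)^* &= \mathcal{S}_{2^m - 1} \quad \text{for } r = 0 \text{ and } m - 1, \\
\operatorname{Aut} \mathcal{R}(r, m) &= \mathcal{S}_{2^m} \quad \text{for } r = 0,\ m - 1 \text{ and } m.
\end{aligned}
$$

In the remaining cases we show that equality holds in (20) and (23).


Theorem 24. For $1 \le r \le m - 2$,
(a) $\operatorname{Aut} \mathcal{R}(r, m)^* = GL(m, 2)$,
(b) $\operatorname{Aut} \mathcal{R}(r, m) = GA(m)$.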


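Proof. (i) We have

$$\text{simplex code } \mathcal{H}_m^{\perp} \longrightarrow \mathcal{R}(1, m)^* \xleftarrow{\ \text{puncture 0 coordinate}\ } \mathcal{R}(1, m).$$

By Problem 30 of Ch. 8, $\operatorname{Aut} \mathcal{H}_m^{\perp} = \operatorname{Aut} \mathcal{R}(1, m)^*$. From (23), and the remark following Theorem 13 of Ch. 8, since $\mathcal{H}_m^{\perp}$ has dimension $m$,

$$\operatorname{Aut} \mathcal{H}_m^{\perp} = \operatorname{Aut} \mathcal{R}(1, m)^* = GL(m, 2).$$

Finally, by Problem 29 of Ch. 8, $\operatorname{Aut} \mathcal{H}_m = GL(m, 2)$.

(ii) Let $G_1 = \operatorname{Aut} \mathcal{R}(1, m)^*$, $G_2 = \operatorname{Aut} \mathcal{R}(1, m)$. Clearly $G_1$ is the subgroup of $G_2$ which fixes the 0 coordinate. Since GA$(m)$ is transitive, so is $G_2$. Each coset of $G_1$ in $G_2$ sends 0 to a different point, so $|G_2| = 2^m |G_1|$. Therefore from (19) and (22) $G_2 = GA(m)$. Again by Problem 29 of Ch. 8, $\operatorname{Aut} \mathcal{R}(m - 2, m) = \operatorname{Aut} \mathcal{R}(1, m) = GA(m)$.

(iii) From Theorem 23 and (i), (ii),

$$GA(m) = \operatorname{Aut} \mathcal{R}(1, m) \subset \operatorname{Aut} \mathcal{R}(2, m) \subset \cdots \subset \operatorname{Aut} \mathcal{R}(m - 2, m) = GA(m),$$

$$GL(m, 2) = \operatorname{Aut} \mathcal{R}(1, m)^* \subset \operatorname{Aut} \mathcal{R}(2, m)^* \subset \cdots \subset \operatorname{Aut} \mathcal{R}(m - 2, m)^* = GL(m, 2).$$

Q.E.D.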


Problem. (9) Show that GL$(m, 2)$, GA$(m)$ are in fact groups of the stated orders, and are respectively doubly and triply transitive.


*§10. Mattson-Solomon polynomials of RM codes


In this section we shall show that the Boolean function defining a codeword of an RM code is really the same as the Mattson-Solomon polynomial (Ch. 8) of the codeword.

Let $\alpha$ be a primitive element of GF$(2^m)$. Then $1, \alpha, \ldots, \alpha^{m-1}$ is a basis for GF$(2^m)$. Let $\lambda_0, \ldots, \lambda_{m-1}$ be the complementary basis (Ch. 4). We shall now consider RM codes to be defined by truth tables in which the columns are taken in the order $0, 1, \alpha, \alpha^2, \ldots, \alpha^{2^m - 2}$. For example, when $m = 3$, the truth table is shown in Fig. 13.12.


Fig. 13.12. (Truth table for $m = 3$, with columns labeled $0, 1, \alpha, \alpha^2, \alpha^3, \alpha^4, \alpha^5, \alpha^6$.)


Lemma 25. The MS polynomial of the codeword $v_{j+1}$ in $\mathcal{R}(1, m)$ is $T_m(\lambda_j z)$, $0 \le j \le m - 1$, where $T_m$ is the trace function defined in §8 of Ch. 4.


Proof. Let $M$ be the matrix consisting of the rows $v_m, \ldots, v_1$ of the truth table, with the first, or zero, column deleted. $M$ is an $m \times (2^m - 1)$ matrix $(M_{ki})$, and

$$\alpha^i = \sum_{k=0}^{m-1} M_{ki} \alpha^k, \qquad 0 \le i \le 2^m - 2.$$

We must show that

$$T_m(\lambda_j \alpha^i) = M_{ji}.$$

In fact,

$$T_m(\lambda_j \alpha^i) = \sum_{k=0}^{m-1} M_{ki} T_m(\lambda_j \alpha^k) = M_{ji},$$

by the property of a complementary basis. Q.E.D.


Corollary 26. If $a$ is any binary vector of length $2^m$, corresponding to the Boolean function $a(v_1, \ldots, v_m)$, then the MS polynomial of $a$ is

$$A(z) = a(T_m(\lambda_0 z), \ldots, T_m(\lambda_{m-1} z)). \tag{24}$$


Notes. (i) $A(0) = \sum_{i=0}^{2^m - 2} A(\alpha^i) = a(0, \ldots, 0)$ is an overall parity check on $a$.
(ii) When evaluating the RHS of (24), high powers of $z$ are reduced by $z^{2^m - 1} = 1$. However, once $A(z)$ has been obtained, in order to use the properties of MS polynomials given in Ch. 8, $A(z)$ must be considered as a polynomial in $F[z]$, $F = \mathrm{GF}(2^m)$.

Conversely, if $a$ is the vector with MS polynomial $A(z)$, the B.f. corresponding to $a$ is

$$A(v_1 + v_2 \alpha + \cdots + v_m \alpha^{m-1}).$$


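Example. Suppose $m = 3$ and the codeword is $v_1$, so that the MS polynomial is $A(z) = z + z^2 + z^4$. Then the corresponding B.f. is

$$
\begin{aligned}
A(v_1 + v_2 \alpha + v_3 \alpha^2) &= v_1 + v_2 T_3(\alpha) + v_3 T_3(\alpha^2) \\
&= v_1 + (v_2 + v_3)(\alpha + \alpha^2 + \alpha^4) = v_1,
\end{aligned}
$$

in agreement with Fig. 13.12.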
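This example can be verified by direct computation in GF(8). The sketch below assumes that $\alpha$ satisfies $\alpha^3 = \alpha + 1$, which is the choice consistent with $\alpha + \alpha^2 + \alpha^4 = 0$ in the calculation above; field elements are stored as 3-bit integers whose bit $i$ is the coefficient of $\alpha^i$, and the function names are hypothetical.

```python
PRIMITIVE_POLY = 0b1011   # x^3 + x + 1, i.e. alpha^3 = alpha + 1 (an assumption)

def gf8_mul(a, b):
    """Multiply two elements of GF(8) given as 3-bit integers."""
    prod = 0
    for i in range(3):
        if (b >> i) & 1:
            prod ^= a << i
    for deg in (4, 3):                      # reduce modulo the primitive polynomial
        if (prod >> deg) & 1:
            prod ^= PRIMITIVE_POLY << (deg - 3)
    return prod

def trace(z):
    """T_3(z) = z + z^2 + z^4, which takes the value 0 or 1."""
    z2 = gf8_mul(z, z)
    z4 = gf8_mul(z2, z2)
    return z ^ z2 ^ z4

def boolean_function_of_trace(v1, v2, v3):
    """Evaluate A(z) = z + z^2 + z^4 at z = v1 + v2*alpha + v3*alpha^2."""
    z = v1 | (v2 << 1) | (v3 << 2)
    return trace(z)
```

Running `boolean_function_of_trace` over all eight triples reproduces the coordinate $v_1$, as the example asserts.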


Problem. (10) Repeat for the codeword 6%. 


*§11. The action of the general affine group on Mattson-Solomon polynomials 


Definition. An affine polynomial is a linearized polynomial plus a constant (see §9 of Ch. 4), i.e. has the form

$$F(z) = \gamma + \sum_{i=0}^{m-1} \gamma_i z^{2^i}, \qquad \text{where } \gamma, \gamma_i \in \mathrm{GF}(2^m).$$


Problem. (11) Show that those zeros of an affine polynomial which lie in GF$(2^m)$ form an $r$-flat in EG$(m, 2)$.


The main result of this section is the following: 


Theorem 27. A transformation of the general affine group GA$(m)$ acts on the MS polynomial $A(z)$ of a vector of length $2^m$ by replacing $z$ by $F(z)$, where $F(z)$ is an affine polynomial with exactly one zero in GF$(2^m)$. Conversely, any such transformation of MS polynomials arises from a transformation of GA$(m)$.


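Proof. Consider any transformation belonging to GA$(m)$, say

$$v^T \to A v^T + b,$$

where

$$v = (v_1, \ldots, v_m) = (T_m(\lambda_0 z), \ldots, T_m(\lambda_{m-1} z)).$$

The variable $z$ in the MS polynomial is related to $v$ by

$$z = v_1 + v_2 \alpha + \cdots + v_m \alpha^{m-1} = \boldsymbol{\alpha} \cdot v^T,$$

where $\boldsymbol{\alpha} = (1, \alpha, \ldots, \alpha^{m-1})$. Thus $z$ is transformed as follows:

$$z \to \boldsymbol{\alpha} A v^T + \boldsymbol{\alpha} \cdot b. \tag{25}$$

Now

$$
\boldsymbol{\alpha} A v^T
= \boldsymbol{\alpha} A \begin{pmatrix} T_m(\lambda_0 z) \\ \vdots \\ T_m(\lambda_{m-1} z) \end{pmatrix}
= \boldsymbol{\alpha} A \begin{pmatrix} \sum_{i=0}^{m-1} (\lambda_0 z)^{2^i} \\ \vdots \\ \sum_{i=0}^{m-1} (\lambda_{m-1} z)^{2^i} \end{pmatrix}
$$

$$= (\boldsymbol{\alpha} A \Lambda^T) z + (\boldsymbol{\alpha} A (\Lambda^2)^T) z^2 + \cdots + (\boldsymbol{\alpha} A (\Lambda^{2^{m-1}})^T) z^{2^{m-1}},$$

where $\Lambda^{2^i} = (\lambda_0^{2^i}, \ldots, \lambda_{m-1}^{2^i})$ and the $T$ denotes transpose.

Let $\boldsymbol{\alpha} A = \beta = (\beta_0, \ldots, \beta_{m-1})$ be a new basis for GF$(2^m)$. Then

$$z \to \boldsymbol{\alpha} \cdot b + (\beta \cdot \Lambda^T) z + (\beta \cdot (\Lambda^2)^T) z^2 + \cdots + (\beta \cdot (\Lambda^{2^{m-1}})^T) z^{2^{m-1}}, \tag{26}$$

which is an affine polynomial.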


Problem. (12) Show that the polynomial (26) has exactly one zero in GF$(2^m)$.

Conversely, any transformation

$$z \to F(z) = u_0 + f(z), \qquad u_0 \in \mathrm{GF}(2^m), \tag{27}$$

where $f(z)$ is a linearized polynomial and $F(z)$ has exactly one zero in GF$(2^m)$, is in GA$(m)$. We decompose (27) into

$$z \to z + u_0,$$

which is clearly in GA$(m)$, followed by

$$z \to f(z) = \sum_{i=0}^{m-1} \gamma_i z^{2^i}. \tag{28}$$

Problem. (13) Show that (28) is in GL$(m, 2)$, i.e. has the form

$$z \to \sum_{i=0}^{m-1} \big( \beta \cdot (\Lambda^{2^i})^T \big) z^{2^i}$$

for some basis $\beta_0, \ldots, \beta_{m-1}$ of GF$(2^m)$.


Notes to Chapter 13. 


§1. Reed-Muller codes are named after Reed [1104] and Muller [975]
(although Peterson and Weldon [1040, p. 141] attribute their discovery to an 
earlier, unpublished paper of Mitani [963]). 


Nonbinary RM codes have been defined by several authors - see Delsarte 
et al. [365], Kasami, Lin and Peterson [739-741], Massey et al. [924] and 
Weldon [1401]. See also [99, 1396]. 


§2. Truth tables are widely used in switching theory (see for example McCluskey [935, Ch. 3]) and in elementary logic, where they are usually written with FALSE instead of 0 and TRUE instead of 1 (see for example Kemeny et al. [755, Ch. 1]). In either form they are of great importance in discrete mathematics. For the disjunctive normal form see Harrison [606, p. 59] or McCluskey [935, p. 78].


§3. Problem 6 is due to S.M. Reddy.

§4. Lemma 6 is from Rothschild and Van Lint [1128].


§5. The proof of Theorem 10 is taken from Delsarte [343]. 

Lucas’ theorem for multinomial coefficients was used in the proof of 
Theorem 10. This is a straightforward generalization of the usual Lucas 
theorem for binomial coefficients, which in the binary case is as follows. 


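Theorem 28. (Lucas [862].) Let the binary expansions of $n$, $k$ and $l = n - k$ be

$$n = \sum_i n_i 2^i, \qquad k = \sum_i k_i 2^i, \qquad l = \sum_i l_i 2^i,$$

where $n_i$, $k_i$, $l_i$ are 0 or 1. Then

$$
\binom{n}{k} \equiv
\begin{cases}
1 \pmod 2 & \text{iff } k_i \le n_i \text{ for all } i, \\
0 \pmod 2 & \text{iff } k_i > n_i \text{ for some } i.
\end{cases}
$$

Equivalently,

$$
\binom{n}{k} \equiv
\begin{cases}
1 \pmod 2 & \text{iff } k_i + l_i \le n_i \text{ for all } i, \\
0 \pmod 2 & \text{iff } k_i + l_i > n_i \text{ for some } i.
\end{cases}
$$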

For a proof see for example Berlekamp [113, p. 113]. Singmaster [1216] gives 
generalizations. 
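Both forms of the binary Lucas theorem translate into one-line bit tests, which can be compared with the binomial coefficients themselves. In the sketch below (hypothetical function names) the condition $k_i \le n_i$ for all $i$ becomes "no bit of $k$ lies outside $n$", and the condition $k_i + l_i \le n_i$ for all $i$ becomes "no bit is shared by $k$ and $l$".

```python
from math import comb

def parity_by_digits(n, k):
    """Binomial(n, k) mod 2 via the first form: k_i <= n_i for all i."""
    return int((k & ~n) == 0)

def parity_by_carries(n, k):
    """Binomial(n, k) mod 2 via the second form: k_i + l_i <= n_i for all i, l = n - k."""
    l = n - k
    return int((k & l) == 0)

# Compare both forms with the binomial coefficients for all 0 <= k <= n < 64.
all_agree = all(
    comb(n, k) % 2 == parity_by_digits(n, k) == parity_by_carries(n, k)
    for n in range(64) for k in range(n + 1)
)
```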

That RM codes are extended cyclic codes (Theorem 11) was simul- 
taneously discovered by Kasami, Lin and Peterson [739, 740] and Kolesnik 
and Mironchikov (see [774] and the references given there). See also Camion 
[237]. 

Theorem 13 is from MacWilliams and Mann [884]. The ranks of the 
incidence matrices of subspaces of other dimensions have been determined 
by Goethals and Delsarte [499], Hamada [591], and Smith [1243-1246]. 


§6. The Reed decoding algorithm was given by Reed [1104], and was the first nontrivial majority logic decoding algorithm. For more about this algorithm see Gore [544] and Green and San Souci [555]. Massey [918] (see also [921]) and Kolesnik and Mironchikov (described in Dobrushin [378]) extensively studied majority logic and threshold decoding. Rudolph [1130, 1131] introduced one-step weighted majority decoding, and this work was extended by Chow [294], Duc [387], Gore [544, 545], Ng [990] and Rudolph and Robbins [1134].

Techniques for speeding up the decoding of Reed-Muller codes were given by Weldon [1403]. See also Peterson and Weldon [1040, Ch. 10]. Decoding Reed-Muller and other codes using a general purpose computer has been investigated by Paschburg et al. [1025, 1026].

Duc [388] has given conditions on a code which must be satisfied if L-step decoding can correct $[\tfrac12(d - 1)]$ errors. See also Kugurakov [786].

The Reed decoding algorithm will correct many error patterns of weight greater than $[\tfrac12(d - 1)]$. Krichevskii [784] has investigated just how many. Theorems 17 and 19 are given by Lin [835-837].

Other papers on majority-logic decoding are Berman and Yudanina [137], 
Chen [271], Delsarte [348], Duc and Skattebol [390], Dyn'kin and Tene- 
gol'ts [397], Kasami and Lin [732, 733], Kladov [763], Kolesnik [773], Lon- 
gobardi et al. [860], Redinbo [1101], Shiva and Tavares [1205], Smith [1245] 
and Warren [1390]. 


§7. Decoding by sequential code reduction was introduced by Rudolph and Hartmann [1132]. Some applications are given in [1112]. Meggitt decoders were described in [952]. Theorem 21 is due to Rudolph [1131]. Rudolph and Robbins [1134] show that the same result holds using only positive weights $a_i$ in (15). Longobardi et al. [860] and Robbins [1115] have found efficient threshold decoding circuits for certain codes.


§8. Difference-set cyclic codes were introduced by Weldon [1400]. Their dimension was found by Graham and MacWilliams [553]. For Theorem 22 see Singer [1213].

For more about difference sets see Baumert [82], Hall [589], Mann [908] and Raghavarao [1085, Ch. 10].

Euclidean and projective geometry codes were studied by Prange [1074, 1075], but the first published discussion was given by Rudolph [1130]. There is now an extensive literature on these codes and their generalizations; see for example Chen [267-270], Chen et al. [273, 274, 276], Cooper and Gore [307], Delsarte [343], Goethals [491, 494], Goethals and Delsarte [499], Gore and Cooper [548], Hartmann et al. [611], Kasami et al. [732, 740, 741], Lin et al. [833, 835-837, 840], Rao [1092], Smith [1246] and Weldon [1402].


§9. The class of codes which are invariant under the general affine group has been studied by Kasami, Lin and Peterson [738] and Delsarte [344].


§11. Problem 11 is from Berlekamp [113, Ch. 11].


First-order Reed-Muller codes


§1. Introduction


In this chapter we continue the study of Reed-Muller codes begun in the previous chapter, concentrating on the first-order codes $\mathcal{R}(1, m)$. Under the name of pseudo-noise or PN sequences, the codewords of $\mathcal{R}(1, m)$, or more precisely, of the simplex code $\mathcal{S}_m$, are widely used in range-finding, synchronizing, modulation, and scrambling, and in §2 we describe their properties. In §3 the difficult (and essentially unsolved) problem of classifying the cosets of $\mathcal{R}(1, m)$ is investigated. Then §4 describes the encoding and decoding of $\mathcal{R}(1, m)$. This family of codes is one of the few for which maximum likelihood decoding is practical, using the Green Machine decoder. (One of these codes was used to transmit pictures from Mars.) The final section deals with cosets of greatest minimum weight. These correspond to Boolean functions which are least like linear functions, the so-called bent functions.


§2. Pseudo-noise sequences 


The codewords (except for 0 and 1) of the cyclic $[2^m - 1, m, 2^{m-1}]$ simplex 
code $\mathcal{S}_m$ or the $[2^m, m + 1, 2^{m-1}]$ extended cyclic first-order Reed-Muller code 
R(1, m) resemble random sequences of 0's and 1's (Fig. 14.1). In fact we shall 
see that if c is any nonzero codeword of $\mathcal{S}_m$, then c has many of the properties 
that we would expect from a sequence obtained by tossing a fair coin $2^m - 1$ 
times. For example, the number of 0's and the number of 1's in c are as nearly 
equal as they can be. Also, define a run to be a maximal string of consecutive 
identical symbols. Then one half of the runs in c have length 1, one quarter 
have length 2, one eighth have length 3, and so on. In each case the number of 
runs of 0's is equal to the number of runs of 1's. Perhaps the most important 
property of c is that its auto-correlation function is given by 

\[ \rho(0) = 1, \qquad \rho(\tau) = -\frac{1}{2^m - 1} \quad \text{for } 1 \le \tau \le 2^m - 2 \]

(see Fig. 14.4). 

Fig. 14.1. Codewords of the [7, 3, 4] cyclic simplex code. 

This randomness makes these codewords very useful in a number of 
applications, such as range-finding, synchronizing, modulation, scrambling, 
etc. 

Of course the codewords are not really random, and one way this shows up 
is that the properties we have mentioned hold for every nonzero codeword in 
a simplex code, whereas in a coin-tossing experiment there would be some 
variation from sequence to sequence. (For this reason these codewords are 
unsuitable for serious encryption.) 

These codewords can be generated by shift registers, and we now describe 
how this is done and give their properties. 

Let 

\[ h(x) = x^m + h_{m-1} x^{m-1} + \cdots + h_1 x + 1 \]

be a primitive irreducible polynomial of degree m over GF(2) (see Ch. 4). As 
in §3 of Ch. 8, h(x) is the check polynomial of the $[2^m - 1, m, 2^{m-1}]$ simplex 
code $\mathcal{S}_m$. We construct a linear feedback shift register whose feedback 
connections are defined by h(x), as in Fig. 14.2. 





Fig. 14.2. Feedback shift register defined by h(x). 

Successive states of the register (the output is taken from the right-hand stage): 

    0001, 1000, 0100, 0010, 1001, 1100, 0110, 1011, 
    0101, 1010, 1101, 1110, 1111, 0111, 0011, 0001 (repeats) 

Output sequence: 100010011010111 ..., period = 15. 

Fig. 14.3. Shift register corresponding to $h(x) = x^4 + x + 1$, showing successive states. 


For example, if $h(x) = x^4 + x + 1$, the shift register is shown in Fig. 14.3. 

Suppose the initial contents (or state) of the shift register are 
$a_{m-1}, \ldots, a_1, a_0$ as in Fig. 14.2. The output is taken from the right-hand end of 
the register, and is the infinite binary sequence $a = a_0 a_1 a_2 \ldots$. 


Definition. For any nonzero initial state, the output a is called a pseudo-noise 
(or PN) sequence. (These sequences are also called pseudo-random 
sequences, m-sequences, or maximal length feedback shift register 
sequences.) An example is shown in Fig. 14.3, which gives the successive 
states of the shift register if the initial state is 0001. The output sequence is 
the 4th column, i.e., 

\[ a = a_0 a_1 a_2 \cdots = 100\ 010\ 011\ 010\ 111,\ 100 \ldots \tag{1} \]

having period 15. 
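
The example can be reproduced in a few lines of Python. The sketch below is ours, not part of the original text: the function name `pn_sequence` and the coefficient list `h_low` are illustrative choices. It simulates the register of Fig. 14.3 through the recurrence $a_{i+4} = a_{i+1} + a_i$ (mod 2) implied by the feedback connections of $h(x) = x^4 + x + 1$, starting from $a_0 a_1 a_2 a_3 = 1000$ (the state written 0001 in the figure).

```python
def pn_sequence(h_low, init, length):
    """Return a_0 ... a_{length-1} of the shift register output.

    h_low = [h_0, h_1, ..., h_{m-1}] are the low-order coefficients of
    h(x) = x^m + h_{m-1} x^{m-1} + ... + h_1 x + h_0 (with h_0 = 1),
    and init = [a_0, ..., a_{m-1}] is the initial state.
    """
    m = len(h_low)
    a = list(init)
    while len(a) < length:
        i = len(a) - m
        # a_{i+m} = h_{m-1} a_{i+m-1} + ... + h_1 a_{i+1} + a_i  (mod 2)
        a.append(sum(h_low[j] * a[i + j] for j in range(m)) % 2)
    return a[:length]


# h(x) = x^4 + x + 1, initial state a_0 a_1 a_2 a_3 = 1 0 0 0
a = pn_sequence([1, 1, 0, 0], [1, 0, 0, 0], 30)
print("".join(map(str, a[:15])))   # one period of the sequence (1)
```

The first 15 digits agree with (1), and the next 15 repeat them.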


Properties of a PN sequence. 


Property I. A PN sequence a satisfies the recurrence 

\[ a_i = h_{m-1} a_{i-1} + h_{m-2} a_{i-2} + \cdots + h_1 a_{i-m+1} + a_{i-m} \]

for i = m, m + 1, .... 

Property II. For some initial state $a_0 \ldots a_{m-1}$ the sequence a is periodic with 
period $n = 2^m - 1$, i.e. $a_{n+i} = a_i$ for all $i \ge 0$, and n is the smallest number with 
this property. 


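Proof. Observe that for any a, Property (I) implies 

\[
\begin{pmatrix} a_{i-m+1} \\ a_{i-m+2} \\ \vdots \\ a_{i-1} \\ a_i \end{pmatrix}
=
\begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & & & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
1 & h_1 & h_2 & \cdots & h_{m-1}
\end{pmatrix}
\begin{pmatrix} a_{i-m} \\ a_{i-m+1} \\ \vdots \\ a_{i-2} \\ a_{i-1} \end{pmatrix}.
\]

Call this m x m matrix U. Then 

\[
\begin{pmatrix} a_{i-m+1} \\ \vdots \\ a_i \end{pmatrix}
= U \begin{pmatrix} a_{i-m} \\ \vdots \\ a_{i-1} \end{pmatrix}
= U^2 \begin{pmatrix} a_{i-m-1} \\ \vdots \\ a_{i-2} \end{pmatrix}
= \cdots
= U^{i-m+1} \begin{pmatrix} a_0 \\ \vdots \\ a_{m-1} \end{pmatrix}.
\]

Now U is the companion matrix of h(x) (see §3 of Ch. 4), and by Problem 20 
of Ch. 4, $n = 2^m - 1$ is the smallest number such that $U^n = I$. Therefore there 
is a vector $b = (a_0 \ldots a_{m-1})$ such that $U^n b^T = b^T$, and $U^i b^T \ne b^T$ for 
$1 \le i \le n - 1$. If the initial state is taken to be b, then a has period $2^m - 1$. Q.E.D. 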


Property III. With this initial state, the shift register goes through all possible 
$2^m - 1$ nonzero states before repeating. 


Proof. Obvious because the period is $2^m - 1$. Q.E.D. 


Note that the zero state doesn't occur, unless a is identically zero. Also $2^m - 1$ 
is the maximum possible period for an m-stage linear shift register. 
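
Properties II and III are easy to check by machine for small m. The following Python sketch (the helper names `lfsr_states` and `cycle` are our own) steps the register defined by $h(x) = x^4 + x + 1$ until the initial state recurs and counts the states visited. For contrast it does the same for $x^4 + x^3 + x^2 + x + 1$, which is irreducible but not primitive.

```python
def lfsr_states(h_low, state):
    """Yield the successive states (a_i, ..., a_{i+m-1}) of the register."""
    m = len(h_low)
    state = tuple(state)
    while True:
        yield state
        new_bit = sum(h_low[j] * state[j] for j in range(m)) % 2
        state = state[1:] + (new_bit,)


def cycle(h_low, start):
    """List the states visited from `start` until `start` recurs."""
    seen = []
    for s in lfsr_states(h_low, start):
        if seen and s == tuple(start):
            return seen
        seen.append(s)


states = cycle([1, 1, 0, 0], (1, 0, 0, 0))       # h(x) = x^4 + x + 1
print(len(states), len(set(states)))             # 15 distinct nonzero states

short = cycle([1, 1, 1, 1], (1, 0, 0, 0))        # x^4 + x^3 + x^2 + x + 1
print(len(short))                                # period 5, a divisor of 15
```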


Property IV. For any nonzero initial state, the output sequence has period 
$2^m - 1$, and in fact is obtained by dropping some of the initial digits from a. 


Proof. This follows from (III). Q.E.D. 

Now let us examine the properties of any segment 

\[ c = a_i \ldots a_{i+n-1} \]

of length $n = 2^m - 1$ from a. 
Property V. c belongs to the simplex code $\mathcal{S}_m$ with check polynomial h(x). 


Proof. c is clearly in the code with parity check polynomial h(x), since c 
satisfies Equation (12) of Ch. 7. Since h(x) is a primitive polynomial, this is a 
simplex code (by §3 of Ch. 8). 


Property VI. (The shift-and-add property.) The sum of any segment c with a 
cyclic shift of itself is another cyclic shift of c. 


Proof. Immediate from V. Q.E.D. 
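
The shift-and-add property can be confirmed numerically for the period-15 sequence (1). This is a small Python sketch of our own (the helper `shift` is an illustrative name): it adds c to each of its nontrivial cyclic shifts and checks that the sum is again a cyclic shift of c.

```python
c = [int(ch) for ch in "100010011010111"]   # one period of the sequence (1)
n = len(c)


def shift(seq, k):
    """Cyclic shift: position i of the result holds seq[(i + k) mod n]."""
    return seq[k:] + seq[:k]


shifts = [shift(c, k) for k in range(n)]
for k in range(1, n):
    total = [(x + y) % 2 for x, y in zip(c, shift(c, k))]
    assert total in shifts          # the sum is again a cyclic shift of c

# For k = 1 the sum is the shift of c by 4 places.
first = [(x + y) % 2 for x, y in zip(c, shift(c, 1))]
print(shifts.index(first))
```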


Problems. (1) (The window property.) Show that if a window of width m is 
slid along a PN sequence then each of the $2^m - 1$ nonzero binary m-tuples is 
seen exactly once in a period. 

(2) Consider an m-stage shift register defined by h(x) (as in Fig. 14.2), 
where h(x) is not necessarily a primitive polynomial. If the initial state is 
$a_0 = 1, a_1 = \cdots = a_{m-1} = 0$, show that the period of a is p, where p is the 
smallest positive integer for which h(x) divides $x^p - 1$. 

(3) If h(x) is irreducible, but not necessarily primitive, then p divides 
$2^m - 1$. 

(4) If $a = a_0 a_1 \ldots$ is a PN sequence, then so is $b = a_0 a_j a_{2j} \ldots$ if j is 
relatively prime to $2^m - 1$. 

(5) Some shift of a, i.e. $b = a_i a_{i+1} a_{i+2} \ldots = b_0 b_1 b_2 \ldots$ (say) has the property 
that $b_i = b_{2i}$ for all i. (b just consists of repetitions of the idempotent of the 
simplex code - see §4 of Ch. 8.) 


Pseudo-randomness properties. 


Property VII. In any segment c there are $2^{m-1}$ 1's and $2^{m-1} - 1$ 0's. 
Proof. From (V). Q.E.D. 
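
As an illustration (ours, not the book's), the following Python sketch counts the 1's and 0's in one period of the sequence (1), and also tabulates the runs, which are the subject of Property VIII below. Runs are counted cyclically, so the period is first rotated to begin at a run boundary.

```python
from collections import Counter
from itertools import groupby

c = [int(ch) for ch in "100010011010111"]   # one period of sequence (1), m = 4

ones, zeros = sum(c), len(c) - sum(c)

# Rotate so that the period starts at a run boundary, then split into runs.
start = next(i for i in range(len(c)) if c[i] != c[i - 1])
rotated = c[start:] + c[:start]
runs = Counter((bit, len(list(group))) for bit, group in groupby(rotated))

print(ones, zeros)            # 8 ones and 7 zeros
print(sorted(runs.items()))   # keys are (symbol, run length)
```

Of the 8 runs, 4 have length 1 (two of each symbol), 2 have length 2 (one of each symbol), and one has length 3; the remaining run is the single run of four 1's.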


Property VIII. In c, one half of the runs have length 1, one quarter have length 
2, one eighth have length 3, and so on, as long as these fractions give integral 
numbers of runs. In each case the number of runs of 0's is equal to the number 
of runs of 1's. 

Problem. (6) Prove Property VIII. 


Autocorrelation function. We come now to the most important property, the 
autocorrelation function. The autocorrelation function $\rho(\tau)$ of an infinite real 
or complex sequence $s_0 s_1 s_2 \ldots$ of period n is defined by 

\[ \rho(\tau) = \frac{1}{n} \sum_{j=0}^{n-1} s_j \bar{s}_{j+\tau} \quad \text{for } \tau = 0, \pm 1, \pm 2, \ldots, \tag{2} \]

where the bar denotes complex conjugation. This is a periodic function: 
$\rho(\tau) = \rho(\tau + n)$. The autocorrelation function of a binary sequence $a_0 a_1 \ldots$ of 
period n is then defined to be the autocorrelation function of the real 
sequence $(-1)^{a_0}, (-1)^{a_1}, \ldots$ obtained by replacing 1's by -1's and 0's by +1's. 
Thus 

\[ \rho(\tau) = \frac{1}{n} \sum_{j=0}^{n-1} (-1)^{a_j + a_{j+\tau}}. \tag{3} \]

Alternatively, let A be the number of places where $a_0 \ldots a_{n-1}$ and the cyclic 
shift $a_\tau a_{\tau+1} \ldots a_{\tau+n-1}$ agree, and D the number of places where they disagree 
(so A + D = n). Then 

\[ \rho(\tau) = \frac{A - D}{n}. \tag{4} \]

For example, the PN sequence (1) has autocorrelation function $\rho(0) = 1$, 
$\rho(\tau) = -\frac{1}{15}$ for $1 \le \tau \le 14$, as shown in Fig. 14.4. 


Property IX. The autocorrelation function of a PN sequence of period 
$n = 2^m - 1$ is given by 

\[ \rho(0) = 1, \qquad \rho(\tau) = -\frac{1}{2^m - 1} \quad \text{for } 1 \le \tau \le 2^m - 2. \tag{5} \]

Fig. 14.4. Autocorrelation function of a PN sequence. 

Proof. From (4) 

\[ \rho(\tau) = \frac{n - 2d}{n} \tag{6} \]

where $d = \mathrm{dist}(a_0 \ldots a_{n-1}, a_\tau \ldots a_{\tau+n-1}) = \mathrm{wt}(a_\sigma \ldots a_{\sigma+n-1})$ for some $\sigma$, by (VI). 
The result then follows from (V) and (VII). Q.E.D. 
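
Formula (4) translates directly into code. The Python sketch below (our own illustration; `autocorrelation` is a name we introduce) evaluates $\rho(\tau)$ exactly, as a fraction, for the period-15 sequence (1).

```python
from fractions import Fraction


def autocorrelation(a, tau):
    """rho(tau) = (A - D) / n for one period `a` of a binary sequence."""
    n = len(a)
    agree = sum(a[i] == a[(i + tau) % n] for i in range(n))
    return Fraction(agree - (n - agree), n)


a = [int(ch) for ch in "100010011010111"]
rho = [autocorrelation(a, tau) for tau in range(len(a))]
print(rho[0], set(rho[1:]))   # rho(0) = 1, and a single off-peak value -1/15
```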


Problems. (7) Show that (5) is the best possible autocorrelation function of 
any binary sequence of period $n = 2^m - 1$, in the sense of minimizing 
$\max_{0 < i < n} \rho(i)$. 

(8) (A test to distinguish a PN sequence from a coin-tossing sequence.) 
Let $c_0, \ldots, c_{N-1}$ be N consecutive binary digits from a PN sequence of 
period $2^m - 1$, where $N < 2^m - 1$, and form the matrix 

\[
M = \begin{pmatrix}
c_0 & c_1 & \cdots & c_{N-b} \\
c_1 & c_2 & \cdots & c_{N-b+1} \\
\vdots & & & \vdots \\
c_{b-1} & c_b & \cdots & c_{N-1}
\end{pmatrix},
\]

where $m < b < \frac{1}{2}N$. Show that the rank of M over GF(2) is less than b. [Hint: 
there are only m linearly independent sequences in $\mathcal{S}_m$.] On the other hand, 
show that if $c_0, \ldots, c_{N-1}$ is a segment of a coin-tossing sequence, where each 
$c_i$ is 0 or 1 with probability $\frac{1}{2}$, then the probability that rank (M) < b is at most 
$2^{2b-N-1}$. This is very small if $b \ll \frac{1}{2}N$. 

Thus the question "Is rank (M) = b?" is a test on a small number of digits 
from a PN sequence which shows a departure from true randomness. E.g. if 
m = 11, $2^m - 1 = 2047$, b = 15, the test will fail if applied to any N = 50 
consecutive digits of a PN sequence, whereas the probability that a coin-tossing 
sequence fails is at most $2^{-21}$. 
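
The rank test of Problem 8 is simple to try out. The Python sketch below is only an illustration under our own choice of parameters (m = 4, N = 14, b = 6, much smaller than the example with m = 11 above); `gf2_rank` and `window_matrix` are names we introduce. It builds M from 14 digits of the sequence (1) and computes its rank over GF(2) by elimination on bit masks.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    pivots = {}                      # leading-bit position -> reduced row
    for row in rows:
        v = int("".join(map(str, row)), 2)
        while v:
            top = v.bit_length()
            if top not in pivots:
                pivots[top] = v      # new pivot found
                break
            v ^= pivots[top]         # cancel the leading bit
    return len(pivots)


def window_matrix(c, b):
    """The b x (N - b + 1) matrix M: row i is c_i ... c_{i+N-b}."""
    width = len(c) - b + 1
    return [c[i:i + width] for i in range(b)]


pn = [int(ch) for ch in "10001001101011"]   # N = 14 digits of sequence (1)
b = 6                                       # m = 4, so m < b < N/2
print(gf2_rank(window_matrix(pn, b)))       # rank m = 4, which is less than b
```

Rows 4 and 5 of M are sums of earlier rows, because the digits satisfy $a_{i+4} = a_{i+1} + a_i$, so the rank cannot exceed m.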


§3. Cosets of the first-order Reed-Muller code 


The problem of enumerating the cosets of a first-order Reed-Muller code 
arises in various practical situations. The general problem is unsolved (see 
Research Problem 1.1 of Chapter 1), although the cosets have been 
completely enumerated for $n \le 32$. However, there are a few interesting 
things we can say. For example, finding the weight distribution of the coset 
containing v amounts to finding the Hadamard transform of v. Also 
enumerating the cosets of R(1, m) is equivalent to classifying Boolean 
functions modulo linear functions. The properties of the cosets described here 
will also be useful in decoding (see §4). We end this section with a table of 
cosets of R(1, 4). Those cosets of greatest minimum weight are especially 
interesting and will be studied in the final section of the chapter. 

Notation. As in §2 of the preceding chapter, $v = (v_1, \ldots, v_m)$ denotes a vector 
which ranges over $V^m$, and if $f(v_1, \ldots, v_m)$ is any Boolean function, f is the 
corresponding binary vector of length $2^m$. 

The first-order Reed-Muller code R(1, m) consists of all vectors 

\[ u_0 \mathbf{1} + \sum_{i=1}^{m} u_i v_i, \qquad u_i = 0 \text{ or } 1, \tag{7} \]

corresponding to linear Boolean functions. Define the orthogonal code $\mathcal{O}_m$ to 
be the $[2^m, m, 2^{m-1}]$ code consisting of the vectors 

\[ \sum_{i=1}^{m} u_i v_i = u \cdot v, \qquad u_i \in \{0, 1\}. \]

Thus 

\[ \mathcal{R}(1, m) = \mathcal{O}_m \cup (\mathbf{1} + \mathcal{O}_m). \]

Suppose that in the codewords of R(1, m) we replace 1's by -1's and 0's 
by 1's. The resulting set of $2^{m+1}$ real vectors are the coordinates of the vertices 
of a regular figure in $2^m$-dimensional Euclidean space, called a cross-polytope 
(i.e. a generalized octahedron). For example, when m = 2, we obtain the 
4-dimensional cross-polytope (also called a 16-cell) shown in Fig. 14.5. 


Fig. 14.5. 4-dimensional cross-polytope, with vertices corresponding to codewords 
of R(1, 2). 

This set of real vectors is also called a biorthogonal signal set - see 
Problem 43 of Ch. 1. 

If the same transformation is applied to the codewords of $\mathcal{O}_m$, we obtain a 
set of $2^m$ mutually orthogonal real vectors. 

For any vector $u = (u_1, \ldots, u_m)$ in $V^m$, f(u) will denote the value of f at u, 
or equally the component of f in the place corresponding to u. 

It will be convenient to have a name for the real vector obtained from a 
binary vector f by replacing 1's by -1's and 0's by +1's - call it F. Thus the 
component of F in the place corresponding to u is 

\[ F(u) = (-1)^{f(u)}. \]


Hadamard transforms and cosets of R(1, m). Recall from Ch. 2 that the 
Hadamard transform of a real vector F is given by 

\[ \hat{F}(u) = \sum_{v \in V^m} (-1)^{u \cdot v} F(v) = \sum_{v \in V^m} (-1)^{u \cdot v + f(v)}, \qquad u \in V^m. \tag{8} \]

$\hat{F}$ is a real vector of length $2^m$. Alternatively 

\[ \hat{F} = F H, \]

where H is the $2^m \times 2^m$ symmetric Hadamard matrix given by 

\[ H = (H_{u,v}), \qquad H_{u,v} = (-1)^{u \cdot v}, \qquad u, v \in V^m. \tag{9} \]

Consequently 

\[ F = \frac{1}{2^m} \hat{F} H, \tag{10} \]

or 

\[ F(v) = \frac{1}{2^m} \sum_{u \in V^m} (-1)^{u \cdot v} \hat{F}(u). \tag{11} \]

Observe from (8) that $\hat{F}(u)$ is equal to the number of 0's minus the number of 
1's in the binary vector 

\[ f + \sum_{i=1}^{m} u_i v_i. \]

Thus 

\[ \hat{F}(u) = 2^m - 2 \, \mathrm{dist}\Big\{ f, \sum_{i=1}^{m} u_i v_i \Big\}, \tag{12} \]

or 

\[ \mathrm{dist}\Big\{ f, \sum_{i=1}^{m} u_i v_i \Big\} = \tfrac{1}{2} \{ 2^m - \hat{F}(u) \}. \tag{13} \]

Also 

\[ \mathrm{dist}\Big\{ f, \mathbf{1} + \sum_{i=1}^{m} u_i v_i \Big\} = \tfrac{1}{2} \{ 2^m + \hat{F}(u) \}. \tag{14} \]

Now the weight distribution of that coset (of a code $\mathcal{C}$) which contains f gives 
the distances of f from the codewords of $\mathcal{C}$. Therefore we have proved: 


Theorem 1. The weight distribution of that coset of R(1, m) which contains f is 
$\tfrac{1}{2} \{ 2^m \pm \hat{F}(u) \}$ for $u \in V^m$. 


The weight distribution of the coset containing f is thus determined by the 
Hadamard transform of F. 


Problem. (9) If the coset of $\mathcal{O}_m$ containing f has weight distribution $A'_i(f)$, 
$0 \le i \le 2^m$, show that the coset of R(1, m) containing f has weight distribution 
$A_i(f) = A'_i(f) + A'_{2^m - i}(f)$, $0 \le i \le 2^m$. 


Equations (13) and (14) say that the closest codeword of R(1, m) to f is that 
$\sum u_i v_i$ for which $|\hat{F}(u)|$ is largest. 


Example. For m = 2, we have 

    v_2 = 0 0 1 1 
    v_1 = 0 1 0 1 

Suppose $f = 0\,0\,0\,1 = v_1 v_2$. 
Then F = 1 1 1 - (where - stands for -1). 

The Hadamard transform coefficients of F are, from (8), 

\[
\begin{aligned}
\hat{F}(00) &= F(00) + F(01) + F(10) + F(11) = 1 + 1 + 1 - 1 = 2, \\
\hat{F}(01) &= F(00) - F(01) + F(10) - F(11) = 2, \\
\hat{F}(10) &= 2, \\
\hat{F}(11) &= -2.
\end{aligned}
\]

Indeed, 

\[ F = \tfrac{1}{4} \left[ 2 \times (1\,1\,1\,1) + 2 \times (1\,{-}\,1\,{-}) + 2 \times (1\,1\,{-}\,{-}) - 2 \times (1\,{-}\,{-}\,1) \right] = 1\,1\,1\,{-}, \]

verifying (11). 
The code R(1, 2) consists of the vectors {0000, 0101, 0011, 0110, 1111, 1010, 
1100, 1001}. The weight distribution of the coset containing f = 0001 is, 
according to Theorem 1, 

\[ A_1(f) = 4, \qquad A_3(f) = 4, \]

which is indeed the case. 
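
The example can be reproduced by evaluating (8) directly. The Python sketch below is our own illustration (the function names are not from the text): it computes the Hadamard transform of $f = v_1 v_2$ by brute force and then applies Theorem 1 to obtain the weight distribution of the coset.

```python
from collections import Counter
from itertools import product


def hadamard_transform(f, m):
    """Return {u: F^(u)} with F^(u) = sum_v (-1)^(u.v + f(v)), as in (8).

    `f` maps each m-tuple v in V^m to 0 or 1.
    """
    points = list(product((0, 1), repeat=m))
    dot = lambda u, v: sum(x * y for x, y in zip(u, v)) % 2
    return {u: sum((-1) ** (dot(u, v) + f[v]) for v in points) for u in points}


def coset_weights(f, m):
    """Weight distribution of the coset f + R(1, m), using Theorem 1."""
    counts = Counter()
    for value in hadamard_transform(f, m).values():
        counts[(2 ** m - value) // 2] += 1   # distance to sum u_i v_i, by (13)
        counts[(2 ** m + value) // 2] += 1   # distance to 1 + sum u_i v_i, by (14)
    return dict(counts)


f = {v: v[0] * v[1] for v in product((0, 1), repeat=2)}   # f = v_1 v_2
print(hadamard_transform(f, 2))
print(coset_weights(f, 2))
```

The direct sum in (8) takes about $2^{2m}$ operations, which is adequate for such small cases.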
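The Hadamard transform coefficients of a (+1, -1)-vector F satisfy the 
following orthogonality relation. 


Lemma 2. 

\[
\sum_{u \in V^m} \hat{F}(u) \hat{F}(u + v) =
\begin{cases}
2^{2m} & \text{if } v = 0, \\
0 & \text{if } v \ne 0.
\end{cases}
\]

Proof. 

\[
\begin{aligned}
\text{LHS} &= \sum_{u \in V^m} \sum_{w \in V^m} (-1)^{u \cdot w} F(w) \sum_{x \in V^m} (-1)^{(u+v) \cdot x} F(x) \\
&= \sum_{w, x \in V^m} (-1)^{v \cdot x} F(w) F(x) \sum_{u \in V^m} (-1)^{u \cdot (w + x)}.
\end{aligned}
\]

The inner sum is equal to $2^m \delta_{w,x}$. Therefore 

\[
\begin{aligned}
\text{LHS} &= 2^m \sum_{w \in V^m} (-1)^{v \cdot w} F(w)^2 \\
&= 2^m \sum_{w \in V^m} (-1)^{v \cdot w}, \quad \text{since } F(w) = \pm 1, \\
&= 2^{2m} \delta_{v,0}. \qquad \text{Q.E.D.}
\end{aligned}
\]

Corollary 3. (Parseval's equation.) 

\[ \sum_{u \in V^m} \hat{F}(u)^2 = 2^{2m}. \tag{15} \]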
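
Lemma 2 and Parseval's equation can be checked numerically for any particular f. In the Python sketch below the test function $f = v_1 v_2 v_3 + v_1$ with m = 3 is an arbitrary choice of ours, made only for illustration.

```python
from itertools import product

m = 3
points = list(product((0, 1), repeat=m))
dot = lambda u, v: sum(x * y for x, y in zip(u, v)) % 2
add = lambda u, v: tuple((x + y) % 2 for x, y in zip(u, v))

# An arbitrary test function (an assumption of this sketch): f = v1 v2 v3 + v1.
f = {v: (v[0] * v[1] * v[2] + v[0]) % 2 for v in points}
F_hat = {u: sum((-1) ** (dot(u, v) + f[v]) for v in points) for u in points}


def correlation(v):
    """Left-hand side of Lemma 2 for the shift v."""
    return sum(F_hat[u] * F_hat[add(u, v)] for u in points)


print([correlation(v) for v in points])   # 2^(2m) = 64 at v = 0, else 0
```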


Note that Corollary 3 and Equation (12) imply that the weight distribution 
$A'_i(f)$, $0 \le i \le 2^m$, of the coset $f + \mathcal{O}_m$ satisfies 

\[ \sum_{i=0}^{2^m} (2^m - 2i)^2 A'_i(f) = 2^{2m}. \]


Boolean functions and cosets of R(1, m). The codewords of R(1, m) are the 
linear functions (7) of $v_1, \ldots, v_m$. Let us say that two Boolean functions 
$f(v_1, \ldots, v_m)$ and $g(v_1, \ldots, v_m)$ are equivalent if the difference 

\[ f(v_1, \ldots, v_m) - g(v_1, \ldots, v_m) \]

is in R(1, m). If this is so, f and g are in the same coset of R(1, m). 
Equivalence classes of Boolean functions under this definition of equivalence 
are in 1-1 correspondence with the cosets of R(1, m). 

Theorem 4. Suppose the Boolean functions f and g are related by 

\[ g(v) = f(Av + B), \tag{16} \]

for some invertible m x m binary matrix A and some binary m-tuple B. We say 
that g is obtained from f by an affine transformation. Then the cosets of 
R(1, m) containing f and g have the same weight distribution. 


Proof. From Theorem 1 it is enough to show that the sets $\{\pm \hat{G}(u) : u \in V^m\}$ 
and $\{\pm \hat{F}(u) : u \in V^m\}$ are equal. In fact, 

\[
\hat{G}(u) = \sum_{v \in V^m} (-1)^{u \cdot v} G(v) = \sum_{v \in V^m} (-1)^{u \cdot v} F(Av + B).
\]

Set $v = A^{-1} w + A^{-1} B$, then 

\[
\begin{aligned}
\hat{G}(u) &= \sum_{w \in V^m} (-1)^{u \cdot A^{-1} w} (-1)^{u \cdot A^{-1} B} F(w) \\
&= \pm \sum_{w \in V^m} (-1)^{u' \cdot w} F(w), \quad \text{where } u' = u A^{-1}, \\
&= \pm \hat{F}(u'). \qquad \text{Q.E.D.}
\end{aligned}
\]
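
Theorem 4 can be illustrated on a small case. In the Python sketch below, the function $f = v_1 v_2 v_3$, the matrix A and the vector B are arbitrary choices of ours (A is upper triangular with 1's on the diagonal, hence invertible). By Theorem 1 the coset weight distribution depends only on the multiset of values $|\hat{F}(u)|$, so it suffices to compare these.

```python
from itertools import product

m = 3
points = list(product((0, 1), repeat=m))
dot = lambda u, v: sum(x * y for x, y in zip(u, v)) % 2


def spectrum(f):
    """Sorted values |F^(u)|; by Theorem 1 these fix the coset weights."""
    return sorted(abs(sum((-1) ** (dot(u, v) + f[v]) for v in points))
                  for u in points)


def affine(v, A, B):
    """Return Av + B over GF(2), with A given as a list of rows."""
    return tuple((dot(row, v) + b) % 2 for row, b in zip(A, B))


A = [(1, 1, 0), (0, 1, 1), (0, 0, 1)]   # invertible (unit upper triangular)
B = (1, 0, 1)

f = {v: v[0] * v[1] * v[2] for v in points}   # f = v1 v2 v3
g = {v: f[affine(v, A, B)] for v in points}   # g(v) = f(Av + B)

print(spectrum(f))
print(spectrum(g))   # the same multiset
```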


Therefore, in order to lump together cosets with the same weight 
distribution, we can introduce a stronger definition of equivalence. Namely we 
define f and g to be equivalent if 

\[ g(v) = f(Av + B) + a_0 \mathbf{1} + \sum_{i=1}^{m} a_i v_i \tag{17} \]

for some binary invertible matrix A, vector B and constants $a_i$. Now all 
Boolean functions in the same equivalence class belong to cosets with the 
same weight distribution. 

However, the cosets containing f and g may have the same weight 
distribution even if f and g are not related as in Equation (17). The first time 
this happens is when n = 32, when for example the cosets containing 

\[ f = v_1 v_2 + v_3 v_4 \]

and 

\[ g = v_1 v_2 v_3 + v_1 v_4 v_5 + v_2 v_3 + v_2 v_4 + v_3 v_5 \]

both have the weight distribution 

\[ A_{12}(f) = A_{20}(f) = 16, \qquad A_{16}(f) = 32. \]


We conclude this section with a table giving the weight distribution of the 
cosets of R(1, 4) (Fig. 14.6). The table gives, for each weight distribution, the 
simplest corresponding Boolean function f and the number of cosets having 
this weight distribution. 

        Typical Boolean   Weight distribution (a dot marks a weight that does not occur) 
Number  function            0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16  Remarks 
     1  -                   1  .  .  .  .  .  .  . 30  .  .  .  .  .  .  .  1  The code itself 
    16  1234                .  1  .  .  .  .  . 15  . 15  .  .  .  .  .  1  . 
   120  123                 .  .  1  .  .  .  7  . 16  .  7  .  .  .  1  .  . 
   560  1234 + 12           .  .  .  1  .  3  . 12  . 12  .  3  .  1  .  .  . 
   840  123 + 14            .  .  .  .  2  .  8  . 12  .  8  .  2  .  .  .  . 
    35  12                  .  .  .  .  4  .  .  . 24  .  .  .  4  .  .  .  . 
   448  1234 + 12 + 34      .  .  .  .  .  6  . 10  . 10  .  6  .  .  .  .  . 
    28  12 + 34             .  .  .  .  .  . 16  .  .  . 16  .  .  .  .  .  .  Bent functions 

Fig. 14.6. Cosets of first-order Reed-Muller code of length 16. 

For example, the last line of the table means that there 
are 28 cosets with weight distribution 

\[ A_6(f) = A_{10}(f) = 16, \]

where $f = v_1 v_2 + v_3 v_4$ or is obtained from this by a substitution of the form shown 
in Equation (17). These 28 f's are called bent functions, for reasons which will be 
given in the final section (see also Problem 16). 


§4. Encoding and decoding R(1, m) 


General techniques for encoding and decoding Reed-Muller codes were 
described in Ch. 13. However there are special techniques for first-order 
Reed-Muller codes which are described in this section. 

R(1, m) is a $[2^m, m + 1, 2^{m-1}]$ code, and so has low rate and can correct 
many errors. It is therefore particularly suited to very noisy channels. For 
example R(1, 5) was successfully used in the Mariner 9 spacecraft to transmit 
pictures from Mars, one of which is shown in Fig. 14.7. (Each dot in the 
picture was assigned one of 64 levels of greyness and encoded into a 32-bit 
codeword.) 

Encoding is especially simple. We describe the method by giving the 
encoder for R(1, 3), an [8, 4, 4] code. 

A message $u_0 u_1 u_2 u_3$ is to be encoded into the codeword 

\[
(x_0 x_1 \cdots x_7) = (u_0 u_1 u_2 u_3)
\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 & 1
\end{pmatrix}. \tag{18}
\]

This is accomplished by the circuit shown in Fig. 14.8. The clock circuit in 
Fig. 14.8 goes through the successive states $t_1 t_2 t_3$ = 000, 001, 010, 011, 100, ..., 
111, 000, 001, ... (i.e., counts from 0 to 7). The circuit forms 

\[ u_0 + t_1 u_1 + t_2 u_2 + t_3 u_3 \]

which, from Equation (18), is the codeword $x_0 x_1 \ldots x_7$. Nothing could be 
simpler. 
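
In software the encoder is just as short. The following Python sketch (ours; `encode_rm13` is an illustrative name) mimics the clocked circuit: for each clock state $t_1 t_2 t_3$, taken as the binary expansion of the position t with $t_1$ most significant, it outputs $u_0 + t_1 u_1 + t_2 u_2 + t_3 u_3$ (mod 2).

```python
def encode_rm13(u):
    """Encode the message u = (u0, u1, u2, u3) into a codeword of R(1, 3).

    The clock state t1 t2 t3 is the 3-bit binary expansion of the position t,
    and the output digit is x_t = u0 + t1 u1 + t2 u2 + t3 u3 (mod 2).
    """
    u0, u1, u2, u3 = u
    word = []
    for t in range(8):
        t1, t2, t3 = (t >> 2) & 1, (t >> 1) & 1, t & 1
        word.append((u0 + t1 * u1 + t2 * u2 + t3 * u3) % 2)
    return word


print(encode_rm13((0, 1, 0, 0)))   # [0, 0, 0, 0, 1, 1, 1, 1]
print(encode_rm13((1, 0, 1, 1)))   # [1, 0, 0, 1, 1, 0, 0, 1]
```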


Decoding. As we saw in §3 of Ch. 1, maximum likelihood decoding requires 
comparing the received vector f with every codeword (7) of R(1, m). I.e., we 
must find the distance from f to every codeword of R(1, m), and then decode 
f as the closest codeword. From Equations (13) and (14) above, this amounts 
to finding the largest component $|\hat{F}(u)|$, where $\hat{F}$ is the Hadamard transform 
of F given by Equation (8). Suppose the largest component is $|\hat{F}(u_1, \ldots, u_m)|$. 
If $\hat{F}(u_1, \ldots, u_m) \ge 0$ we decode f as 

\[ \sum_{i=1}^{m} u_i v_i \]

(from (13)), whereas if $\hat{F}(u_1, \ldots, u_m) < 0$ we decode f as 

\[ \mathbf{1} + \sum_{i=1}^{m} u_i v_i. \]

Fig. 14.7. Part of the Grand Canyon on Mars. This photograph was transmitted by 
the Mariner 9 spacecraft on 19 January 1972 using the first-order Reed-Muller code 
R(1, 5). Photograph courtesy of NASA/JPL. 

Fig. 14.8. Encoder for R(1, 3). 

Direct calculation of $\hat{F} = F H_{2^m}$ by multiplying F and $H_{2^m}$ would require 
about $2^m \times 2^m = 2^{2m}$ additions and subtractions. Fortunately there is a faster 
way to obtain $\hat{F}$, which is a discrete version of the so-called Fast Fourier 
Transform. This is possible because $H_{2^m}$ can be written as a product of 
m $2^m \times 2^m$ matrices, each of which has only two non-zero elements per 
column. Thus only $m 2^m$ additions and subtractions are needed to evaluate 
$\hat{F} = F H_{2^m}$. 

In order to explain this decomposition of $H_{2^m}$ we must introduce 
Kronecker products. 


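Kronecker product of matrices. 


Definition. If $A = (a_{ij})$ is an m x m matrix and $B = (b_{ij})$ is an n x n matrix over 
any field, the Kronecker product of A and B is the mn x mn matrix obtained 
from A by replacing every entry $a_{ij}$ by $a_{ij} B$. This product is written $A \otimes B$. 
Symbolically we have 

\[ A \otimes B = (a_{ij} B). \]

For example if 

\[
I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},
\]

then 

\[
I_2 \otimes H_2 = \begin{pmatrix}
1 & 1 & 0 & 0 \\
1 & -1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
0 & 0 & 1 & -1
\end{pmatrix}, \tag{19}
\]

\[
H_2 \otimes I_2 = \begin{pmatrix}
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & -1 & 0 \\
0 & 1 & 0 & -1
\end{pmatrix}. \tag{20}
\]

This shows that in general $A \otimes B \ne B \otimes A$. 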


Problem. (10) Prove the following properties of the Kronecker product. 
(i) Associative law: 

\[ A \otimes (B \otimes C) = (A \otimes B) \otimes C. \]

(ii) Distributive law: 

\[ (A + B) \otimes C = A \otimes C + B \otimes C. \]

(iii) 

\[ (A \otimes B)(C \otimes D) = (AC) \otimes (BD). \tag{21} \]


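Hadamard matrices. Let us define the Hadamard matrix $H_{2^m}$ of order $2^m$ 
inductively by 

\[ H_{2^m} = H_2 \otimes H_{2^{m-1}}, \quad \text{for } m \ge 2. \]

$H_{2^m}$ is the Sylvester-type Hadamard matrix of Ch. 2 (see Fig. 2.3); it is also the 
matrix given by Equation (9) above provided u and v take on values in $V^m$ in 
the right order. 


Theorem 5. (The fast Hadamard transform theorem.) 

\[ H_{2^m} = M_{2^m}^{(1)} M_{2^m}^{(2)} \cdots M_{2^m}^{(m)}, \tag{22} \]

where 

\[ M_{2^m}^{(i)} = I_{2^{m-i}} \otimes H_2 \otimes I_{2^{i-1}}, \qquad 1 \le i \le m, \]

and $I_n$ is an n x n unit matrix. 


Proof. By induction on m. For m = 1 the result is obvious. Assume the result 
is true for m. Then for $1 \le i \le m$ 

\[
\begin{aligned}
M_{2^{m+1}}^{(i)} &= I_{2^{m+1-i}} \otimes H_2 \otimes I_{2^{i-1}} \\
&= I_2 \otimes I_{2^{m-i}} \otimes H_2 \otimes I_{2^{i-1}} \\
&= I_2 \otimes M_{2^m}^{(i)},
\end{aligned}
\]

and 

\[ M_{2^{m+1}}^{(m+1)} = H_2 \otimes I_{2^m}. \]

Therefore 

\[
\begin{aligned}
M_{2^{m+1}}^{(1)} \cdots M_{2^{m+1}}^{(m+1)}
&= (I_2 \otimes M_{2^m}^{(1)}) \cdots (I_2 \otimes M_{2^m}^{(m)}) (H_2 \otimes I_{2^m}) \\
&= H_2 \otimes (M_{2^m}^{(1)} \cdots M_{2^m}^{(m)}) \quad \text{by (21)}, \\
&= H_2 \otimes H_{2^m} \quad \text{by the induction hypothesis}, \\
&= H_{2^{m+1}}. \qquad \text{Q.E.D.}
\end{aligned}
\]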


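Example. For m = 2, 

\[ M_4^{(1)} = I_2 \otimes H_2, \qquad M_4^{(2)} = H_2 \otimes I_2, \]

(see (19) and (20)), and indeed 

\[
M_4^{(1)} M_4^{(2)} =
\begin{pmatrix}
1 & 1 & 0 & 0 \\
1 & -1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
0 & 0 & 1 & -1
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & -1 & 0 \\
0 & 1 & 0 & -1
\end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1
\end{pmatrix}
= H_4.
\]

For m = 3, 

\[
\begin{aligned}
M_8^{(1)} &= I_4 \otimes H_2, \\
M_8^{(2)} &= I_2 \otimes H_2 \otimes I_2, \\
M_8^{(3)} &= H_2 \otimes I_4.
\end{aligned}
\]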
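
The factorization (22) can be verified mechanically. The Python sketch below is our own illustration using plain nested lists (the helper names `kron`, `matmul`, `identity`, `sylvester` and `factor` are ours): it builds the three factors for m = 3 and compares their product with $H_8$.

```python
def kron(A, B):
    """Kronecker product: replace each entry a_ij of A by the block a_ij * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def matmul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


H2 = [[1, 1], [1, -1]]


def sylvester(m):
    """H_{2^m}, built up as H_2 (x) H_{2^(m-1)}."""
    H = [[1]]
    for _ in range(m):
        H = kron(H2, H)
    return H


def factor(m, i):
    """M^(i) = I_{2^(m-i)} (x) H_2 (x) I_{2^(i-1)}."""
    return kron(kron(identity(2 ** (m - i)), H2), identity(2 ** (i - 1)))


m = 3
product_of_factors = identity(2 ** m)
for i in range(1, m + 1):
    product_of_factors = matmul(product_of_factors, factor(m, i))
print(product_of_factors == sylvester(m))   # True
```

Each factor has only two nonzero entries per column, which is the source of the saving in the fast transform.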


Decoding circuit: the Green machine. We now give a decoding circuit for 
R(1, m) which is based on Theorem 5. This circuit is called the Green 
machine after its discoverer R.R. Green. We illustrate the method by 
describing the decoder for R(1, 3). 
Suppose $f = f_0 f_1 \cdots f_7$ is the received vector, and let 

\[ F = F_0 F_1 \cdots F_7 = ((-1)^{f_0}, (-1)^{f_1}, \ldots, (-1)^{f_7}). \]

We wish to find 

\[ \hat{F} = F H_8 = (F_0 F_1 \cdots F_7) M_8^{(1)} M_8^{(2)} M_8^{(3)}, \quad \text{from (22)}. \]

Now 

\[
M_8^{(1)} = I_4 \otimes H_2 =
\begin{pmatrix}
H_2 & & & \\
& H_2 & & \\
& & H_2 & \\
& & & H_2
\end{pmatrix},
\qquad
H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.
\]

So 

\[ F M_8^{(1)} = (F_0 + F_1, F_0 - F_1, F_2 + F_3, \ldots, F_6 + F_7, F_6 - F_7). \]

The circuit shown in Fig. 14.9 calculates the components of $F M_8^{(1)}$ two at a 
time. The switches are arranged so that after $F_0 \cdots F_3$ have been read in, the 
two-stage registers on the right contain $(F_0 - F_1, F_0 + F_1)$ and $(F_2 - F_3, F_2 + F_3)$ 
respectively. These four quantities are used in the second stage (see Fig. 
14.10). Then $F_4 \cdots F_7$ are read in, and $(F_4 - F_5, F_4 + F_5)$ and $(F_6 - F_7, F_6 + F_7)$ 
are formed in the same pair of two-stage registers. 

Fig. 14.9. First stage of Green machine. 

The second stage calculates 

\[ (F_0 + F_1, F_0 - F_1, \ldots, F_6 - F_7) M_8^{(2)} \tag{23} \]

where 

\[
M_8^{(2)} = I_2 \otimes H_2 \otimes I_2 =
\begin{pmatrix}
H_2 \otimes I_2 & \\
& H_2 \otimes I_2
\end{pmatrix},
\]

with $H_2 \otimes I_2$ as in (20). So the product (23) is 

\[ (F_0 + F_1 + F_2 + F_3, \; F_0 - F_1 + F_2 - F_3, \; \ldots, \; F_4 - F_5 - F_6 + F_7). \]

This product is formed by the circuit shown in Fig. 14.10. 

Fig. 14.10. Second stage of Green machine. 

The third stage calculates 

\[ (F_0 + F_1 + F_2 + F_3, \; \ldots, \; F_4 - F_5 - F_6 + F_7) M_8^{(3)}, \tag{24} \]

where 

\[
M_8^{(3)} = H_2 \otimes I_4 =
\begin{pmatrix}
I_4 & I_4 \\
I_4 & -I_4
\end{pmatrix}.
\]

So the product (24) is 

\[
\begin{aligned}
& (F_0 + F_1 + \cdots + F_7, \; F_0 - F_1 + F_2 - F_3 + F_4 - F_5 + F_6 - F_7, \; \ldots, \\
& \quad F_0 - F_1 - F_2 + F_3 - F_4 + F_5 + F_6 - F_7) \\
& = (\hat{F}_0, \hat{F}_1, \ldots, \hat{F}_7),
\end{aligned}
\]

which are the desired Hadamard transform components. These are formed by the 
circuit shown in Fig. 14.11. 

Figures 14.9-14.11 together comprise the Green machine. The final stage is 
to find that i for which $|\hat{F}_i|$ is largest. Then f is decoded either as the i-th 
codeword of R(1, 3) if $\hat{F}_i \ge 0$, or as the complement of the i-th codeword if 
$\hat{F}_i < 0$. 

Note that the Green machine has the useful property that the circuit for 
decoding R(1, m + 1) is obtained from that for R(1, m) by adding an extra 
register to the m-th stage and then adding one more stage. 
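
The same three stages are easily expressed in software. The following Python sketch is not a model of the circuit itself but of the computation it performs; the function names are ours, and the mapping from the index i of the largest component to the message digits assumes that positions are numbered by the clock states $t_1 t_2 t_3$ of the encoder, with $t_1$ most significant.

```python
def fast_hadamard(F):
    """Return F H_{2^m} using m stages of 2^m additions/subtractions each."""
    F = list(F)
    n = len(F)
    h = 1
    while h < n:                      # one pass per factor M^(1), M^(2), ...
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                x, y = F[j], F[j + h]
                F[j], F[j + h] = x + y, x - y
        h *= 2
    return F


def decode_rm1(received):
    """Maximum likelihood decoding of R(1, m); returns (u0, u1, ..., um)."""
    n = len(received)
    m = n.bit_length() - 1
    F_hat = fast_hadamard([(-1) ** bit for bit in received])
    i = max(range(n), key=lambda k: abs(F_hat[k]))
    u0 = 0 if F_hat[i] >= 0 else 1    # negative peak: complemented codeword
    return (u0,) + tuple((i >> (m - 1 - j)) & 1 for j in range(m))


print(fast_hadamard([1, 1, 1, -1]))              # the m = 2 example: [2, 2, 2, -2]

# Codeword 10011001 of R(1, 3) (message 1011) with its third digit in error.
print(decode_rm1([1, 0, 1, 1, 1, 0, 0, 1]))      # (1, 0, 1, 1)
```

When several components tie for the largest magnitude the received vector is equidistant from several codewords; this sketch then simply returns the first one found.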





Fig. 14.11. Third stage of Green machine. 

Problem. (11) Show that $M_{2^m}^{(i)} M_{2^m}^{(j)} = M_{2^m}^{(j)} M_{2^m}^{(i)}$. (This implies that the order of the 
stages in the decoder may be changed without altering the final output.) 


§5. Bent Functions 


Cosets of a first-order Reed-Muller code with the largest minimum weight 
are especially interesting. When m is even the corresponding Boolean 
functions are called bent functions, because they are in some sense furthest away 
from linear functions. In this section we study their properties and give 
several constructions. Bent functions will be used in the next chapter in the 
construction of the nonlinear Kerdock codes. 


Definition. A Boolean function $f(v_1, \ldots, v_m)$ is called bent if the Hadamard 
transform coefficients $\hat{F}(u)$ given by Equation (8) are all $\pm 2^{m/2}$. 


Examples. (1) $f(v_1, v_2) = v_1 v_2$ is a bent function, since the $\hat{F}(u)$ are all $\pm 2$ (see 
the example preceding Lemma 2). 
(2) $f(v_1, v_2, v_3, v_4) = v_1 v_2 + v_3 v_4$ is bent, as shown by the last row of Fig. 14.6. 
Since $\hat{F}(u)$ is an integer (from Equation (8)) if $f(v_1, \ldots, v_m)$ is bent then m 
must be even. From now on we assume m is even and $\ge 2$. 
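
The definition can be tested by brute force for small m. In the Python sketch below (our own; `is_bent` is an illustrative name) the two examples are confirmed to be bent, while $v_1 v_2 v_3$, regarded as a function of four variables, is not.

```python
from itertools import product


def is_bent(f, m):
    """True if every Hadamard coefficient F^(u) of f equals +-2^(m/2)."""
    points = list(product((0, 1), repeat=m))
    for u in points:
        coeff = sum((-1) ** (sum(x * y for x, y in zip(u, v)) + f(*v))
                    for v in points)
        if coeff * coeff != 2 ** m:
            return False
    return True


print(is_bent(lambda v1, v2: v1 * v2, 2))                        # True
print(is_bent(lambda v1, v2, v3, v4: v1 * v2 + v3 * v4, 4))      # True
print(is_bent(lambda v1, v2, v3, v4: v1 * v2 * v3, 4))           # False
```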


Theorem 6. A bent function $f(v_1, \ldots, v_m)$ is further away from any linear 
function 

\[ a_0 \mathbf{1} + \sum_{i=1}^{m} a_i v_i \]

than any other Boolean function. More precisely, $f(v_1, \ldots, v_m)$ is bent iff the 
corresponding vector f has distance $2^{m-1} \pm 2^{m/2-1}$ from every codeword of 
R(1, m). If f is not bent, f has distance less than $2^{m-1} - 2^{m/2-1}$ from some 
codeword of R(1, m). 


Proof. If f is not bent, then the $\hat{F}(u)$ are not all $\pm 2^{m/2}$. From Corollary 3, since 
there are $2^m$ summands in Equation (15), some $|\hat{F}(u)|$ must be bigger than $2^{m/2}$. 
Therefore from Equation (13) or (14), the distance between f and some 
codeword of R(1, m) is less than $2^{m-1} - 2^{m/2-1}$. Q.E.D. 


Theorem 7. $f(v_1, \ldots, v_m)$ is bent iff the $2^m \times 2^m$ matrix H whose (u, v)-th entry is 
$(1/2^{m/2}) \hat{F}(u + v)$ is a Hadamard matrix. 


Proof. From Lemma 2. Q.E.D. 

Note that if $f(v_1, \ldots, v_m)$ is bent, then we may write 

\[ \frac{1}{2^{m/2}} \hat{F}(u) = (-1)^{\hat{f}(u)}, \tag{25} \]

which defines a Boolean function $\hat{f}(u_1, \ldots, u_m)$. The Hadamard transform 
coefficients of $\hat{f}$ (obtained by setting $f = \hat{f}$ in Equation (8)) are $2^{m/2} (-1)^{f(u)} = 
\pm 2^{m/2}$. Therefore $\hat{f}$ is also a bent function! 

Thus there is a natural pairing $f \leftrightarrow \hat{f}$ of bent functions. 


Problem. (12) Show that f is bent iff the matrix whose (u, v)-th entry is 
$(-1)^{f(u+v)}$, for $u, v \in V^m$, is a Hadamard matrix. 


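Theorem 8. If $f(v_1, \ldots, v_m)$ is bent and m > 2, then $\deg f \le \frac{1}{2} m$. 


Proof. Suppose f is bent and m > 2. The proof uses the expansion of f given 
by Theorem 1 of Ch. 13, and requires a lemma. 

Let $F(u) = (-1)^{f(u)}$, let $\hat{F}(u)$ be the Hadamard transform of F(u) given by 
Equation (8), and let $\hat{f}(u)$ be as in Equation (25). 


Lemma 9. If $\mathcal{C}$ is any [m, k] code, 

\[ \sum_{u \in \mathcal{C}^\perp} f(u) = 2^{m-k-1} - 2^{m/2-1} + 2^{m/2-k} \sum_{u \in \mathcal{C}} \hat{f}(u), \tag{26} \]

where the sums are evaluated as real sums. 


Proof. This is just a restatement of Lemma 2 of Ch. 5. We start from 

\[ \sum_{u \in \mathcal{C}^\perp} F(u) = \frac{1}{|\mathcal{C}|} \sum_{u \in \mathcal{C}} \hat{F}(u), \]

and set $F(u) = 1 - 2 f(u)$ and $\hat{F}(u) = 2^{m/2} (1 - 2 \hat{f}(u))$. Q.E.D. 


To complete the proof of the theorem, we apply the lemma with 

\[ \mathcal{C}^\perp = \{ b \in V^m : b \subset a \}, \qquad \mathcal{C} = \{ b \in V^m : b \subset \bar{a} \}, \]

where a is some vector of $V^m$ and $\bar{a}$ is its complement. Then $|\mathcal{C}^\perp| = 2^{\mathrm{wt}(a)}$, 
so that $k = m - \mathrm{wt}(a)$, and (26) becomes 

\[ \sum_{b \subset a} f(b) = 2^{\mathrm{wt}(a)-1} - 2^{m/2-1} + 2^{\mathrm{wt}(a)-m/2} \sum_{b \subset \bar{a}} \hat{f}(b). \tag{27} \]

Now Theorem 1 of Ch. 13 states that 

\[ f(v_1, \ldots, v_m) = \sum_{a \in V^m} g(a) v_1^{a_1} \cdots v_m^{a_m}, \]

where 

\[ g(a) = \sum_{b \subset a} f(b) \pmod 2. \]

Thus g(a) is given by Equation (27). But if $\mathrm{wt}(a) > \frac{1}{2} m$ and m > 2, the RHS 
of (27) is even, and g(a) is zero. Therefore f has degree at most $\frac{1}{2} m$. Q.E.D. 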


Problem. (13) Show that f is bent iff for all $v \ne 0$, $v \in V^m$, the directional 
derivative of f in the direction v, defined by 

\[ f_v(x) = f(x + v) + f(x), \qquad x \in V^m, \]

takes the values 0 and 1 equally often. 


We shall now construct some families of bent functions. 


Theorem 10. 

\[ h(u_1, \ldots, u_m, v_1, \ldots, v_n) = f(u_1, \ldots, u_m) + g(v_1, \ldots, v_n) \]

is a bent function (of m + n arguments) iff f is a bent function (of $u_1, \ldots, u_m$) 
and g is a bent function (of $v_1, \ldots, v_n$). 


Proof. We shall write $w \in V^{m+n}$ as w = (u, v) where $u \in V^m$ and $v \in V^n$. From 
Equation (8), 

\[
\begin{aligned}
\hat{H}(w) &= \sum_{t \in V^{m+n}} (-1)^{w \cdot t + h(t)}, \quad \text{where } t = (r, s), \\
&= \sum_{r \in V^m} \sum_{s \in V^n} (-1)^{u \cdot r + f(r) + v \cdot s + g(s)} \\
&= \hat{F}(u) \hat{G}(v).
\end{aligned} \tag{28}
\]

If f and g are bent, then from Equation (28) $\hat{H}(w) = \pm 2^{(m+n)/2}$ and so h is bent. 
Conversely, suppose h is bent but f is not, so that $|\hat{F}(\lambda)| > 2^{m/2}$ for some $\lambda \in V^m$. 
Then with $w = (\lambda, v)$, 

\[ \pm 2^{(m+n)/2} = \hat{H}(w) = \hat{F}(\lambda) \hat{G}(v) \]

for all $v \in V^n$. Therefore $|\hat{G}(v)| < 2^{n/2}$ for all $v \in V^n$, which is impossible by 
Corollary 3. Q.E.D. 

Theorem 10 enables us to generalize examples (1) and (2): 


Corollary 11. 

\[ v_1 v_2 + v_3 v_4 + \cdots + v_{m-1} v_m \tag{29} \]

is bent, for any even $m \ge 2$. 
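
The product formula (28) is easy to observe numerically. The Python sketch below (ours; `transform` is an illustrative name) takes $f = u_1 u_2$ and $g = v_1 v_2 + v_3 v_4$, forms their direct sum h on six variables, and checks both (28) and the conclusion that every coefficient of h has magnitude $2^3$.

```python
from itertools import product


def transform(f, m):
    """Hadamard coefficients {u: F^(u)} of a Boolean function f on V^m."""
    pts = list(product((0, 1), repeat=m))
    return {u: sum((-1) ** (sum(a * b for a, b in zip(u, v)) + f(v))
                   for v in pts) for u in pts}


f = lambda r: r[0] * r[1]                      # bent, m = 2
g = lambda s: s[0] * s[1] + s[2] * s[3]        # bent, n = 4
h = lambda t: f(t[:2]) + g(t[2:])              # direct sum, m + n = 6

F_hat, G_hat, H_hat = transform(f, 2), transform(g, 4), transform(h, 6)

# Equation (28): H^(u, v) = F^(u) G^(v) for every u and v.
assert all(H_hat[u + v] == F_hat[u] * G_hat[v] for u in F_hat for v in G_hat)
print(sorted(set(abs(x) for x in H_hat.values())))   # [8]
```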


It is easy to see that if f(v) is bent then so is any function f(Av + B) 
obtained from f by an affine transformation. 


Problems. (14) Show that if m is even then $v_1 v_2 + v_1 v_3 + \cdots + v_{m-1} v_m$ is bent. 
(15) Use Dickson's theorem (Theorem 4 of the next chapter) to show that 
any quadratic bent function is an affine transformation of (29). 
(16) (a) The bent function $v_1 v_2 + v_3 v_4$ can be represented by the graph on 
the vertices 1, 2, 3, 4 with edges {1, 2} and {3, 4}. 
Show that the 28 quadratic bent functions of $v_1, v_2, v_3, v_4$ are given by graphs 
of the following kinds: two disjoint edges (3 types), a path with three edges 
(12 types), a triangle with a pendant edge (12 types), and the complete graph 
(1 type). 

(b) Let $f = v_1 v_2 + v_3 v_4$. Show that the vectors of weight 6 in f + R(1, 4) form 
a 2-(16, 6, 2) design. (This design is called a biplane - see for example 
Cameron [231]). 

(17) Show that $v_1 v_2$ and $v_1 v_2 + v_3 v_4$ are essentially the only bent functions of 2 
and 4 arguments. 


A Boolean function $f(v) = f(v_1, \ldots, v_m)$ is called decomposable if there is a 
binary matrix A such that 

\[ f(Av) = f_1(v_1, \ldots, v_l) + f_2(v_{l+1}, \ldots, v_m) \]

for some l. Otherwise f is indecomposable. 


Problem. (18) Show that if $f(v_1, \ldots, v_m)$ is a bent function of degree $\frac{1}{2} m \ge 3$, 
then f is indecomposable. 

Theorem 12. For any function $g(v_1, \ldots, v_m)$, the function 

\[ f(u_1, \ldots, u_m, v_1, \ldots, v_m) = \sum_{i=1}^{m} u_i v_i + g(v_1, \ldots, v_m) \tag{30} \]

is bent. 


Proof. From Equations (8) and (13), $f(u_1, \ldots, u_m, v_1, \ldots, v_m)$ is bent iff the 
number of vectors $(u, v) = (u_1, \ldots, u_m, v_1, \ldots, v_m)$ which are zeros of 

\[ h = f(u_1, \ldots, u_m, v_1, \ldots, v_m) + \sum_{i=1}^{m} \lambda_i u_i + \sum_{i=1}^{m} \mu_i v_i \tag{31} \]

is $2^{2m-1} \pm 2^{m-1}$, for all $\lambda = (\lambda_1, \ldots, \lambda_m)$, $\mu = (\mu_1, \ldots, \mu_m)$. Substituting (30) into 
(31) we obtain 

\[ h = g(v_1, \ldots, v_m) + \sum_{i=1}^{m} (v_i + \lambda_i) u_i + \sum_{i=1}^{m} \mu_i v_i. \]

(i) For any $v \ne \lambda$, the first sum is not identically zero, and h is a linear 
function of $u_1, \ldots, u_m$. Thus there are $2^{m-1}$ choices of $u_1, \ldots, u_m$ for which h 
is zero. The total number of zeros of h of this type is $2^{m-1} (2^m - 1)$. 
(ii) Now suppose $v = \lambda$. Then 

\[ h = g(v_1, \ldots, v_m) + \sum_{i=1}^{m} \mu_i v_i, \]

which is either 0 or 1, independent of $u_1, \ldots, u_m$. If it is 1, the total number of 
zeros of h is $2^{2m-1} - 2^{m-1}$. But if it is 0, there are $2^m$ additional zeros (u 
arbitrary, $v = \lambda$), for a total of $2^{2m-1} + 2^{m-1}$. Q.E.D. 
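
Theorem 12 can be tried out on a small case. In the Python sketch below the choice m = 3 and $g = v_1 v_2 v_3 + v_2$ is ours and entirely arbitrary; the resulting f is a bent function of six variables of degree 3, the largest degree allowed by Theorem 8. The function g alone, regarded as a function of all six variables, is of course not bent.

```python
from itertools import product


def is_bent(f, n):
    """True if all 2^n Hadamard coefficients of f have square 2^n."""
    pts = list(product((0, 1), repeat=n))
    return all(
        sum((-1) ** (sum(a * b for a, b in zip(w, t)) + f(t)) for t in pts) ** 2
        == 2 ** n
        for w in pts
    )


m = 3


def g(v):
    """An arbitrary choice of g for this sketch: g = v1 v2 v3 + v2."""
    return v[0] * v[1] * v[2] + v[1]


def f(t):
    """f(u, v) = sum u_i v_i + g(v), where t = (u_1..u_m, v_1..v_m)."""
    u, v = t[:m], t[m:]
    return sum(a * b for a, b in zip(u, v)) + g(v)


print(is_bent(f, 2 * m))                      # True
print(is_bent(lambda t: g(t[m:]), 2 * m))     # False
```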


Problems. (19) Let 

\[ f_1(u_1, \ldots, u_m, v_1, \ldots, v_m) = \sum_{i=1}^{m} u_i v_i. \]

Show that $f_1$, $f_1 + u_1 u_2 u_3$, $f_1 + u_1 u_2 u_3 u_4$, ..., and $f_1 + u_1 \cdots u_m$ are m - 1 
inequivalent bent functions, of degrees 2, 3, ..., m. 
(20) Show that 

\[ f(u_1, \ldots, u_m, v_1, \ldots, v_m) = \sum_{i=1}^{m} \varphi_i(v_1, \ldots, v_m) u_i + g(v_1, \ldots, v_m) \]

is bent, where g(v) is arbitrary and $\varphi(v) = (\varphi_1(v), \ldots, \varphi_m(v))$ is a 1-1-mapping 
of $V^m$ onto itself. 


Thus we have constructed a number of bent functions, and many others 
will be found in the references given in the Notes. But the following problem 
is unsolved: 


Research Problem (14.1). Classify all bent functions of m variables.

Notes on Chapter 14


§2. There is an extensive body of literature on pseudo-noise sequences. See 
for example Ball et al. [62], Balza et al. [63], Briggs and Godfrey [198], 
Brusilovskii [207], Carter [251], Chang [262], Chang and Ho [263], Cumming 
[320], Davies [336], Duvall and Kibler [395], Fillmore and Marx [429], 
Fredricsson [455], Godfrey [487], Gold [515-516], Goldstein and Zierler [521], 
Harvey [621], Hurd [677], Kotov [780], Mayo [931], Parsley and Sarwate 
[1024], Roberts and Davis [1116], Scholefield [1167], Sidel'nikov [1207], Tretter
[1339], Tsao [1341], Weathers et al. [1392], Weng [1407], Willett [1416], [1417] 
and especially Golomb [522,523], Kautz [750], Selmer [1178] and Zierler 
[1461, 1465]. 

For applications of PN sequences to range-finding see Evans and Hagfors 
[412], Golomb [522, Ch. 6], Gorenstein and Weiss [551], Pettengill [1041]; to 
synchronizing see Burrowes [213], Golomb [522, Ch. 8], Kilgus [760], and 
Stiffler [1280]; to modulation see Golomb [522, Ch. 5]; and to scrambling see 
Feistel et al. [423], Henriksson [641], Nakamura and Iwadare [984] and Savage 
[1148]. The unsuitability of PN sequences for encryption is mentioned for 
example by Geffe [469, 470]. 

Sequences of period $2^m$ (rather than $2^m - 1$) can be obtained from nonlinear
shift registers, and are called de Bruijn sequences. See de Bruijn [206], 
Fredricksen et al. [451-454] and Knuth [772, p. 379]. 

If $s_0, \ldots, s_{n-1}$ is a sequence of length n, its periodic autocorrelation
function is defined by (2). Its aperiodic autocorrelation function is given by

$$\rho(\tau) = \sum_{i=0}^{n-\tau-1} s_i s_{i+\tau}, \qquad 0 \le \tau \le n - 1. \qquad (32)$$

The problem of finding binary sequences with small aperiodic autocorrelation
function is much harder; see for example Barker [68], Barton and Mallows
[73], Caprio [241], Golay [511, 513, 514], Golomb and Scholtz [530], Kruskal
[785], Lindner [841], MacWilliams [875], Moharir et al. [968-970], Schneider
and Orr [1157], Simmons [1212], Turyn [1343, 1344], Turyn and Storer [1348]
and Tseng and Liu [1349]. Problems 7 and 8 are from Golomb [522, p. 48] and
Gilbert [480].
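
As an illustration of Equation (32), the sketch below evaluates the aperiodic autocorrelation of a sequence with entries $\pm 1$ (an assumption about how the $s_i$ are represented). The function name is illustrative, and the length-13 Barker sequence is used only as a familiar test input.

```python
def aperiodic_autocorrelation(s):
    """Return [rho(0), ..., rho(n-1)] with rho(tau) = sum_{i=0}^{n-tau-1} s[i] * s[i+tau]."""
    n = len(s)
    return [sum(s[i] * s[i + tau] for i in range(n - tau)) for tau in range(n)]

# Length-13 Barker sequence, written with entries +1 and -1.
barker13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
rho = aperiodic_autocorrelation(barker13)
print(rho[0], max(abs(r) for r in rho[1:]))
```

For this sequence the off-peak values all have magnitude at most 1, which is the defining property of a Barker sequence.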


§3. For geometric properties of cross-polytopes see Coxeter [313]. Lechner
[797-799] has shown that the number of equivalence classes of Boolean
functions (with equivalence defined by Equation (17)) is as follows:

    m        1   2   3   4   5
    number   1   2   3   8   48


Research Problem (14.2). Find these numbers when m = 6. How fast do they
grow with m?

The cosets of $\mathcal{R}(1, 5)$ have been classified by Berlekamp and Welch [133].
Figure 14.6 is based on Sloane and Dick [1233]. See also Hobbs [656], Holmes
[660], Kurshan and Sloane [789], and Sarwate [1144]. Berlekamp [119] applied
Theorem 1 to study the effect of errors in the transformed vector $\hat{F}$ on the
reconstructed vector F. This problem arises when Hadamard transforms are
used to transmit pictures. For an application of these cosets in spectroscopy
see Tai et al. [1298a].

The cosets of a certain $[n^2, 2n - 1, n]$ code arise in studying the
Berlekamp-Gale switching game (Brown and Spencer [204], Gordon and
Witsenhausen [542]).


§4. A good reference for the encoding and decoding of $\mathcal{R}(1, m)$ and its use in
the Mariner '69 spacecraft is Posner [1071]. Detailed descriptions of the
encoder and decoder are given by Duffy [392] and Green [557, 558]. For the
theory of fast Hadamard and Fourier transforms see Bergland [108], Brigham
[199], Cairns [225], Gentleman [472], Green [557, 558], Nicholson [991],
Rushforth [1135], Shanks [1187] and Welch [1398]. For the Kronecker product of
matrices see Bellman [100] or Marcus [914].

Clark and Davis [295] describe how the Green machine can be modified to
apply to other codes.

A coset of $\mathcal{R}(1, 5)$ rather than $\mathcal{R}(1, 5)$ itself was actually used on the
Mariner spacecraft, to facilitate synchronizing the received codewords. (See
Posner [1071], or Baumert and Rumsey [90, 91].)


§5. The name "bent function" and most of the results of §5 are due to
Rothaus [1127]. Dillon [376] (see also [377]) gives an excellent survey, and 
constructs several other families of bent functions. He shows that bent 
functions are equivalent to elementary Hadamard difference sets. See also 
McFarland [950]. 


Second-order Reed—Muller, 
Kerdock and Preparata codes 


§1. Introduction


The key idea in this chapter is that the second-order RM code $\mathcal{R}(2, m)$ is a
union of cosets of the first-order RM code $\mathcal{R}(1, m)$, and that these cosets are
in 1-1 correspondence with symplectic forms (Theorem 1). A symplectic form
is a Boolean function of 2m variables $u_1, \ldots, u_m, v_1, \ldots, v_m$ given by $\sum_{i,j} u_i B_{ij} v_j$,
where $B = (B_{ij})$ is a symmetric binary matrix with zero diagonal. The rank of
the matrix is called the rank of the symplectic form, or of the coset, and
uniquely determines the weight distribution of the coset (Theorem 5). In §2
we count symplectic forms by rank, and hence find the weight distribution of
$\mathcal{R}(2, m)$ (Theorem 8). The problem of finding the weight distribution of a
general RM code is unsolved, although in §3 we do give, without proof, two
general results. The first is Kasami and Tokura's formula for the number of
codewords of weight $< 2^{m-r+1}$ in $\mathcal{R}(r, m)$ (Theorem 11). The second is McEliece's
theorem that the weight of any codeword in $\mathcal{R}(r, m)$ is divisible by $2^{\lceil m/r \rceil - 1}$
(Corollary 13).

In §4 we obtain the weight distribution of certain interesting small
subcodes of $\mathcal{R}(2, m)^*$, including the dual of the double-error-correcting BCH
code (Figs. 15.3 and 15.4).

In §5 we build up maximal sets of symplectic forms with the property that
the rank of the sum of any two forms in the set is $\ge 2d$. The corresponding
subcodes of $\mathcal{R}(2, m)$ have several interesting properties. These subcodes are
linear if m is odd; but for even m they are the nonlinear generalized Kerdock
codes $\mathcal{DG}(m, d)$. When $d = \tfrac{1}{2}m$, $\mathcal{DG}(m, \tfrac{1}{2}m)$ is the Kerdock code $\mathcal{K}(m)$. The
first Kerdock code, $\mathcal{K}(4)$, is equivalent to the (16, 256, 6) Nordstrom-Robinson
code $\mathcal{N}_{16}$ given in Ch. 2. Shortly after the discovery of $\mathcal{N}_{16}$,
Preparata succeeded in generalizing $\mathcal{N}_{16}$ to get an infinite family of codes
$\mathcal{P}(m)$ with parameters $(2^m, 2^{2^m - 2m}, 6)$, for all even $m \ge 4$. (See Fig. 15.12
below.) Then in 1972 Kerdock discovered the codes $\mathcal{K}(m)$. These have
parameters $(2^m, 2^{2m}, 2^{m-1} - 2^{(m-2)/2})$, and are a kind of dual to $\mathcal{P}(m)$ in the sense
that the weight (or distance) distribution of $\mathcal{K}(m)$ is the MacWilliams
transform (§5 of Ch. 5) of the weight distribution of $\mathcal{P}(m)$. (See Fig. 15.6.) The
codes $\mathcal{P}(m)$ and $\mathcal{K}(m)$ appear to contain at least twice as many codewords as
the best linear code with the same length and minimum distance.

The codes $\mathcal{DG}(m, d)$ were discovered by Delsarte and Goethals. In the last
paragraph we describe the codes $\mathcal{I}(m)$, found by Goethals, which are a kind
of dual to $\mathcal{DG}(m, \tfrac{1}{2}(m - 2))$.

The construction of the codes $\mathcal{K}(m)$, $\mathcal{P}(m)$, $\mathcal{DG}(m, d)$, ... will require a good
deal of the algebraic machinery developed in earlier chapters, for example
the group algebra and annihilator polynomial from Ch. 5 and idempotents and
Mattson-Solomon polynomials from Ch. 8.


§2. Weight distribution of the second-order Reed-Muller code


Codewords of the second-order Reed-Muller code $\mathcal{R}(2, m)$ of length
$n = 2^m$ are given by the Boolean functions of degree $\le 2$ in $v = (v_1, \ldots, v_m)$ (see
§3 of Ch. 13). Thus a typical codeword is given by the Boolean function

$$S(v) = \sum_{1 \le i < j \le m} q_{ij} v_i v_j + \sum_{1 \le i \le m} l_i v_i + \epsilon.$$

We may write this as

$$S(v) = vQv^T + Lv^T + \epsilon = Q(v) + L(v) + \epsilon, \qquad (1)$$

where

$$Q(v) = vQv^T, \qquad L(v) = Lv^T, \qquad (2)$$

and $Q = (q_{ij})$ is an upper triangular binary matrix, $L = (l_1, \ldots, l_m)$ is a binary
vector, and $\epsilon$ is 0 or 1. Here $Q(v) = Q(v_1, \ldots, v_m)$ is called a quadratic form
and $L(v) = L(v_1, \ldots, v_m)$ is a linear form. (A form is a homogeneous
polynomial.)

If Q(v) is fixed and the linear function $L(v) + \epsilon$ varies over the first-order
Reed-Muller code $\mathcal{R}(1, m)$, then S(v) runs through a coset of $\mathcal{R}(1, m)$ in
$\mathcal{R}(2, m)$. This coset is characterized by Q. Alternatively we can characterize it
by the symmetric matrix $B = Q + Q^T$; B has all diagonal elements zero.

Associated with the matrix B is another form $\mathcal{B}(u, v)$, defined by

$$\mathcal{B}(u, v) = uBv^T. \qquad (3)$$

Alternatively

$$\mathcal{B}(u, v) = u(Q + Q^T)v^T$$
$$= uQv^T + vQu^T$$
$$= Q(u + v) + Q(u) + Q(v) \qquad (4)$$
$$= S(u + v) + S(u) + S(v) + \epsilon \quad \text{by (1)}. \qquad (5)$$

$\mathcal{B}(u, v)$ is bilinear, i.e. satisfies (from (3))

$$\mathcal{B}(u + v, w) = \mathcal{B}(u, w) + \mathcal{B}(v, w),$$
$$\mathcal{B}(u, v + w) = \mathcal{B}(u, v) + \mathcal{B}(u, w). \qquad (6)$$

From Equation (4), $\mathcal{B}(u, v)$ is also alternating, i.e. satisfies

$$\mathcal{B}(u, u) = 0 \qquad (7)$$

and

$$\mathcal{B}(u, v) = \mathcal{B}(v, u). \qquad (8)$$

A binary form which is alternating and bilinear is called symplectic. Similarly
a binary symmetric matrix with zero diagonal is called a symplectic matrix.
Thus B is a symplectic matrix.

For example the matrix

$$B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

is symplectic, and the corresponding symplectic form is

$$\mathcal{B}(u_1, u_2, v_1, v_2) = (u_1, u_2) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = u_1v_2 + u_2v_1.$$


As a concrete example, consider the first- and second-order RM codes of
length 16. $\mathcal{R}(1, 4)$ is a [16, 5, 8] extended simplex code, and is generated by the
first five rows of Fig. 13.2. $\mathcal{R}(2, 4)$ is a [16, 11, 4] extended Hamming code, and
is generated by the first 11 rows of that figure. $\mathcal{R}(2, 4)$ consists of $2^{11-5} = 64$
cosets of $\mathcal{R}(1, 4)$. One coset is $\mathcal{R}(1, 4)$ itself, and the quadratic and symplectic
forms corresponding to this coset are zero.

Another coset consists of the 32 vectors $v_1v_2 + v_3v_4 + \mathcal{R}(1, 4)$, four of which
are shown in Fig. 15.1.

    0001000100011110 = v_1v_2 + v_3v_4
    1110111011100001 = v_1v_2 + v_3v_4 + 1
    0100010001001011 = v_1v_2 + v_3v_4 + v_1
    1011101110110100 = v_1v_2 + v_3v_4 + v_1 + 1

Fig. 15.1. Four vectors of the coset $v_1v_2 + v_3v_4 + \mathcal{R}(1, 4)$.

The Boolean functions S(v) (Equation (1)) corresponding to these four
vectors are

    v_1v_2 + v_3v_4,
    v_1v_2 + v_3v_4 + 1,
    v_1v_2 + v_3v_4 + v_1,
    v_1v_2 + v_3v_4 + v_1 + 1.

(Incidentally these are bent functions; see §5 of Ch. 14.) They all have the
same quadratic part (Equation (2)),

$$Q(v) = v_1v_2 + v_3v_4 = (v_1\, v_2\, v_3\, v_4) \begin{pmatrix} 0&1&0&0 \\ 0&0&0&0 \\ 0&0&0&1 \\ 0&0&0&0 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{pmatrix}.$$

Thus

$$Q = \begin{pmatrix} 0&1&0&0 \\ 0&0&0&0 \\ 0&0&0&1 \\ 0&0&0&0 \end{pmatrix} \quad \text{and} \quad B = Q + Q^T = \begin{pmatrix} 0&1&0&0 \\ 1&0&0&0 \\ 0&0&0&1 \\ 0&0&1&0 \end{pmatrix} \qquad (9)$$

are the upper triangular and symplectic matrices corresponding to this coset.
The symplectic form corresponding to this coset (Equation (3)) is

$$\mathcal{B}(u, v) = uBv^T = u_1v_2 + u_2v_1 + u_3v_4 + u_4v_3.$$
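
The passage from Q to B and the identity of Equation (4) can be confirmed numerically for this example. The following is a minimal sketch; the function names `quadratic_form` and `symplectic_form` are illustrative.

```python
from itertools import product

# Upper triangular matrix Q of the quadratic form v1*v2 + v3*v4 (Equation (9)).
Q = [[0, 1, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
m = len(Q)

# Symplectic matrix B = Q + Q^T over GF(2).
B = [[(Q[i][j] + Q[j][i]) % 2 for j in range(m)] for i in range(m)]

def quadratic_form(v):
    """Q(v) = v Q v^T over GF(2)."""
    return sum(Q[i][j] * v[i] * v[j] for i in range(m) for j in range(m)) % 2

def symplectic_form(u, v):
    """B(u, v) = u B v^T over GF(2)."""
    return sum(u[i] * B[i][j] * v[j] for i in range(m) for j in range(m)) % 2

vectors = list(product((0, 1), repeat=m))
# Equation (4): B(u, v) = Q(u + v) + Q(u) + Q(v).
identity_holds = all(
    symplectic_form(u, v)
    == (quadratic_form(tuple(a ^ b for a, b in zip(u, v)))
        + quadratic_form(u) + quadratic_form(v)) % 2
    for u in vectors for v in vectors
)
# Equation (7): B(u, u) = 0.
alternating = all(symplectic_form(u, u) == 0 for u in vectors)
print(B, identity_holds, alternating)
```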


Problem. (1) Show that if a function $\mathcal{B}(u, v)$ satisfies Equations (6) and (7)
then $\mathcal{B}(u, v) = Q(u + v) + Q(u) + Q(v)$ for some quadratic form Q(u).

Thus we have shown:


Theorem 1. There is a 1-1-correspondence between symplectic forms and
cosets of $\mathcal{R}(1, m)$ in $\mathcal{R}(2, m)$. The zero form corresponds to $\mathcal{R}(1, m)$ itself.


We shall use the symbol $\mathcal{B}$ to refer both to the coset and to the
corresponding symplectic form.

Clearly the number of distinct symplectic forms or matrices is $2^{\binom{m}{2}}$. But we
also need the number of each rank.


Theorem 2. Let N(m, r) be the number of symplectic m × m matrices of rank r
over GF(2). Then

$$N(m, 2h + 1) = 0,$$

$$N(m, 2h) = \prod_{i=1}^{h} \frac{2^{2i-2}}{2^{2i} - 1} \cdot \prod_{i=0}^{2h-1} (2^{m-i} - 1)$$
$$= \frac{(2^m - 1)(2^{m-1} - 1) \cdots (2^{m-2h+1} - 1)}{(2^{2h} - 1)(2^{2h-2} - 1) \cdots (2^2 - 1)} \cdot 2^{h(h-1)}.$$


Proof. Note that N(1, 0) = 1, N(1, 1) = 0 and N(2, 0) = 1, N(2, 1) = 0, N(2, 2) = 1.

We shall derive a recursion formula for N(m, r). Let A be a fixed m × m
symplectic matrix of rank r, and set

$$B = \begin{pmatrix} A & y^T \\ y & 0 \end{pmatrix}$$

of size (m + 1) × (m + 1).


Lemma 3. Of the $2^m$ different matrices B, $2^m - 2^r$ have rank r + 2, and $2^r$ have
rank r.


Proof of Lemma. If y is independent of the rows of A, which can happen in
$2^m - 2^r$ ways, then $\binom{A}{y}$ has rank r + 1 and B has rank r + 2. On the other hand,
if y is dependent on the rows of A, say y = xA, then $\binom{A}{y}$ has rank r. To show
that $\binom{y^T}{0}$ is dependent on the columns of $\binom{A}{y}$, we observe that

$$\binom{A}{y} x^T = \binom{Ax^T}{xAx^T} = \binom{y^T}{0}. \qquad \text{Q.E.D.}$$


From Lemma 3, the recurrence for N(m, r) is

$$N(m + 1, r) = 2^r N(m, r) + (2^m - 2^{r-2}) N(m, r - 2).$$

The initial value N(1, 1) = 0 implies N(m, 2h + 1) = 0 for all h. It is
straightforward to verify, using the initial value N(m, 0) = 1, that the solution of the
recurrence is

$$N(m, 2) = \frac{(2^m - 1)(2^{m-1} - 1)}{2^2 - 1},$$

$$N(m, 4) = \frac{(2^m - 1)(2^{m-1} - 1)(2^{m-2} - 1)(2^{m-3} - 1)}{(2^4 - 1)(2^2 - 1)} \cdot 2^2,$$

and in general is the formula stated in the theorem. Q.E.D.
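
For small m, Theorem 2 can be compared with a direct enumeration. The sketch below lists every symmetric zero-diagonal binary matrix, computes its rank over GF(2), and checks the counts against the closed formula; the names `gf2_rank`, `count_by_rank` and `N` are illustrative.

```python
from itertools import combinations, product

def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def count_by_rank(m):
    """Count symplectic m x m matrices (symmetric, zero diagonal) by rank."""
    pairs = list(combinations(range(m), 2))
    counts = {}
    for bits in product((0, 1), repeat=len(pairs)):
        rows = [0] * m
        for (i, j), b in zip(pairs, bits):
            if b:
                rows[i] |= 1 << j
                rows[j] |= 1 << i
        r = gf2_rank(rows)
        counts[r] = counts.get(r, 0) + 1
    return counts

def N(m, r):
    """Closed formula of Theorem 2."""
    if r % 2:
        return 0
    h = r // 2
    num = 1
    for i in range(2 * h):
        num *= 2 ** (m - i) - 1
    den = 1
    for i in range(1, h + 1):
        den *= 2 ** (2 * i) - 1
    return 2 ** (h * (h - 1)) * num // den

for m in range(1, 6):
    counts = count_by_rank(m)
    print(m, sorted(counts.items()), [N(m, r) for r in range(m + 1)])
```

The enumeration visits $2^{\binom{m}{2}}$ matrices, so it is meant only as a check for small m.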


We next show that the weight distribution of the coset B depends only on 
the rank of the matrix B. For this we need a fundamental theorem due to 
Dickson, which we state as follows: 







Theorem 4. (Dickson's theorem.) (1) If B is a symplectic m × m matrix of rank
2h, then there exists an invertible binary matrix R such that $RBR^T$ has zeros
everywhere except on the two diagonals immediately above and below the main
diagonal, and there has 1010...100...0 with h ones.

Examples of $RBR^T$ are, when h = 2,

    0100        01000
    1000        10000
    0001        00010
    0010        00100
                00000

(2) Any Boolean function of degree $\le 2$,

$$vQv^T + L(v) + \epsilon,$$

where Q is an upper triangular matrix and L, $\epsilon$ are arbitrary, becomes

$$T(y) = \sum_{i=1}^{h} y_{2i-1} y_{2i} + L(y) + \epsilon$$

under the transformation of variables $y = vR^{-1}$, where R is given by Part (1)
and $B = Q + Q^T$. Moreover $y_1, \ldots, y_{2h}$ are linearly independent.

(3) If L(y) is linearly dependent on $y_1, \ldots, y_{2h}$, we may by an affine
transformation of variables write T(y) as

$$\sum_{i=1}^{h} x_{2i-1} x_{2i} + \epsilon_1, \qquad \epsilon_1 = 0 \text{ or } 1, \qquad (10)$$

where $x_1, \ldots, x_{2h}$ are linearly independent, and each $x_i$ is a linear form in
$y_1, \ldots, y_{2h}, 1$.


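Proof. (1) The statement is clearly true for m = 1, 2. Suppose it is true for
$m \le t$ and $h \le [t/2]$. Then a (t + 1) × (t + 1) symplectic matrix B may be
written as

$$B = \begin{pmatrix} A & a^T \\ a & 0 \end{pmatrix},$$

where A is of the same type. For example, a 3 × 3 symplectic matrix can be
written as

$$\begin{pmatrix} 0 & 0 & a_1 \\ 0 & 0 & a_2 \\ a_1 & a_2 & 0 \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} 0 & 1 & a_1 \\ 1 & 0 & a_2 \\ a_1 & a_2 & 0 \end{pmatrix}.$$

If rank $A < 2[t/2]$, we may, by elementary row and column operations, reduce
B to

$$\begin{pmatrix} A & y^T & 0 & \cdots & 0 \\ y & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix},$$

where

$$\begin{pmatrix} A & y^T \\ y & 0 \end{pmatrix}$$

is of size $\le t \times t$. By induction this may be brought into the form described.
Thus we may suppose that A has rank 2[t/2]. If 2[t/2] = t, then B has rank t,
and

$$B = \begin{pmatrix} 0 & 1 & \cdots & 0 & 0 & a_1 \\ 1 & 0 & \cdots & 0 & 0 & a_2 \\ \vdots & & \ddots & & & \vdots \\ 0 & 0 & \cdots & 0 & 1 & a_{t-1} \\ 0 & 0 & \cdots & 1 & 0 & a_t \\ a_1 & a_2 & \cdots & a_{t-1} & a_t & 0 \end{pmatrix}.$$

In this case

$$R = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\ \vdots & & & & \ddots & & & \vdots \\ a_2 & a_1 & a_4 & a_3 & \cdots & a_t & a_{t-1} & 1 \end{pmatrix}.$$

If 2[t/2] = t - 1 then

$$B = \begin{pmatrix} 0 & 1 & \cdots & 0 & 0 & 0 & a_1 \\ 1 & 0 & \cdots & 0 & 0 & 0 & a_2 \\ \vdots & & \ddots & & & & \vdots \\ 0 & 0 & \cdots & 0 & 1 & 0 & a_{t-2} \\ 0 & 0 & \cdots & 1 & 0 & 0 & a_{t-1} \\ 0 & 0 & \cdots & 0 & 0 & 0 & \epsilon \\ a_1 & a_2 & \cdots & a_{t-2} & a_{t-1} & \epsilon & 0 \end{pmatrix},$$

where rank B = t - 1 if $\epsilon = 0$ and t + 1 if $\epsilon = 1$. In either case use

$$R = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\ \vdots & & & & \ddots & & & & \vdots \\ a_2 & a_1 & a_4 & a_3 & \cdots & a_{t-1} & a_{t-2} & 0 & 1 \end{pmatrix}.$$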

(Check this for m = 4, 5.)

(2) The transformation $x = uR^{-1}$, $y = vR^{-1}$ changes the bilinear form $uBv^T$
into

$$xRBR^Ty^T = \sum_{i=1}^{h} (x_{2i-1}y_{2i} + x_{2i}y_{2i-1}).$$

The corresponding quadratic form is

$$\sum_{i=1}^{h} y_{2i-1}y_{2i}.$$

(3) Let

$$T(y) = \sum_{i=1}^{h} y_{2i-1}y_{2i} + L(y) + \epsilon$$

where L(y) is linearly dependent on the $y_i$, say

$$L(y) = \sum_{i=1}^{2h} l_i y_i.$$

For example, if $l_1 = 1$, $l_2 = 0$, so that

$$T(y) = y_1y_2 + y_1 + \cdots,$$

the substitution $y_1 = x_1$, $y_2 = x_1 + x_2$ changes this to

$$T(y) = x_1x_2 + \sum_{i=2}^{h} y_{2i-1}y_{2i} + \sum_{i=3}^{2h} l_i y_i.$$

If $l_1 = l_2 = 1$ then $T(y) = y_1y_2 + y_1 + y_2 + \cdots$ is changed by the substitution
$y_1 = x_1 + 1$, $y_2 = x_2 + 1$ into

$$T(y) = x_1x_2 + \sum_{i=2}^{h} y_{2i-1}y_{2i} + \sum_{i=3}^{2h} l_i y_i + 1.$$

Clearly we can continue in this way. Q.E.D.


Remarks. (1) Since $\mathcal{R}(1, m)$ contains the Boolean function 1, the coset $\mathcal{B}$ will
contain both

$$\sum_{i=1}^{h} x_{2i-1}x_{2i} \quad \text{and} \quad \sum_{i=1}^{h} x_{2i-1}x_{2i} + 1.$$

(2) The Mattson-Solomon polynomial corresponding to the canonical form

$$\sum_{i=1}^{h} x_{2i-1}x_{2i}$$

is, by Corollary 26 of Ch. 13,

$$\sum_{i=1}^{h} T_m(\alpha_i z) T_m(\beta_i z), \qquad (11)$$

where $\alpha_1, \ldots, \alpha_h, \beta_1, \ldots, \beta_h$ are linearly independent elements of $GF(2^m)$.

Problem. (2) In part (3) of the theorem, show that the constant $\epsilon_1$ is given by

$$\epsilon_1 = \sum_{i=1}^{h} l_{2i-1} l_{2i} + \epsilon.$$

Hence show that the number of L(y) for which $\epsilon_1 = 1$ is $2^{2h-1} - 2^{h-1}$.

We now use the canonical expression for a quadratic form given by
Theorem 4 to obtain the weight distribution of the corresponding coset, as
shown in the following theorem. This result will be used many times in this
chapter.


Theorem 5. If the matrix B has rank 2h, the weight distribution of the
corresponding coset $\mathcal{B}$ of $\mathcal{R}(1, m)$ in $\mathcal{R}(2, m)$ is as follows:

    Weight                      Number of Vectors
    $2^{m-1} - 2^{m-h-1}$       $2^{2h}$
    $2^{m-1}$                   $2^{m+1} - 2^{2h+1}$
    $2^{m-1} + 2^{m-h-1}$       $2^{2h}$


We shall need two lemmas for the proof.


Lemma 6. The number of values of $(x_1, \ldots, x_{2h})$ for which

$$\sum_{i=1}^{h} x_{2i-1}x_{2i}$$

is zero is $2^{2h-1} + 2^{h-1}$.


Proof. If h = 1 this number is 3 (they are 00, 10, 01) so the lemma is true. We
proceed by induction on h.

$$\sum_{i=1}^{h+1} x_{2i-1}x_{2i} = \sum_{i=1}^{h} x_{2i-1}x_{2i} + x_{2h+1}x_{2h+2} = F_1 + F_2 \quad \text{(say)}.$$

There are $3(2^{2h-1} + 2^{h-1})$ cases with $F_1 = F_2 = 0$, and $2^{2h-1} - 2^{h-1}$ cases with
$F_1 = F_2 = 1$. The total number is

$$4 \cdot 2^{2h-1} + 2 \cdot 2^{h-1} = 2^{2h+1} + 2^{h}. \qquad \text{Q.E.D.}$$


Therefore the number of vectors $(x_1, \ldots, x_{2h}, x_{2h+1}, \ldots, x_m)$ for which

$$\sum_{i=1}^{h} x_{2i-1}x_{2i}$$

is zero is

$$2^{m-2h}(2^{2h-1} + 2^{h-1}) = 2^{m-1} + 2^{m-h-1}.$$

Lemma 7. Let

$$L_{2h+1} = \sum_{i=2h+1}^{m} a_i x_i$$

where the $a_i$ are not all zero. The number of values of $x_1, \ldots, x_m$ for which

$$\sum_{i=1}^{h} x_{2i-1}x_{2i} + L_{2h+1}$$

is zero is $2^{m-1}$.

Proof. Use the randomization lemma of Problem 2 of Ch. 13. Q.E.D.


Proof of Theorem 5. Let B have rank 2h. By Dickson's theorem, the quadratic
part of any Boolean function in the coset $\mathcal{B}$ can be transformed to

$$Q(x) = \sum_{i=1}^{h} x_{2i-1}x_{2i}.$$

Let the linear part be L(x). Suppose L(x) has the form

$$L(x) = \sum_{i=1}^{2h} a_i x_i,$$

which can happen in $2^{2h}$ ways. Using Part 3 of Dickson's theorem Q(x) + L(x)
becomes either

$$\sum_{i=1}^{h} y_{2i-1}y_{2i} \quad \text{or} \quad \sum_{i=1}^{h} y_{2i-1}y_{2i} + 1.$$

By the remark following Lemma 6 these have weights $2^{m-1} - 2^{m-h-1}$ and
$2^{m-1} + 2^{m-h-1}$ respectively. On the other hand if L(x) is not dependent on
$x_1, \ldots, x_{2h}$, which happens in $2^{m+1} - 2^{2h+1}$ ways, by Lemma 7 the codeword has
weight $2^{m-1}$. Q.E.D.


Remark. Theorem 5 shows that the larger h is, the greater is the minimum
weight in the coset. h = 0 corresponds to $\mathcal{R}(1, m)$ itself. The largest possible
minimum weight occurs when 2h = m (and so m must be even). The Boolean
function corresponding to such a coset is a quadratic bent function (see §5 of
Ch. 14, especially Problem 15).

We illustrate Theorem 5 by finding the weight distribution of the
coset $v_1v_2 + v_3v_4 + \mathcal{R}(1, 4)$. The matrix B is given in Equation (9) and has rank
4. Thus h = 2, and the coset has weight distribution $A_6 = A_{10} = 16$ (in
agreement with the last row of Fig. 14.6).
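
Theorem 5 can also be checked directly on small cosets by adding every affine function to a fixed quadratic function and tallying the weights. This is a minimal sketch with an illustrative function name, `coset_weights`; the quadratic part is passed in as a Python callable.

```python
from itertools import product
from collections import Counter

def coset_weights(quadratic, m):
    """Weight distribution of the coset quadratic + R(1, m), by exhaustive enumeration."""
    points = list(product((0, 1), repeat=m))
    dist = Counter()
    for coeffs in product((0, 1), repeat=m):   # linear part L(v)
        for eps in (0, 1):                     # constant term
            weight = sum(
                quadratic(v) ^ (sum(c & x for c, x in zip(coeffs, v)) & 1) ^ eps
                for v in points
            )
            dist[weight] += 1
    return dict(dist)

# Rank 4 (h = 2): the coset v1 v2 + v3 v4 + R(1, 4) discussed above.
print(sorted(coset_weights(lambda v: (v[0] & v[1]) ^ (v[2] & v[3]), 4).items()))
# Rank 2 (h = 1): the coset v1 v2 + R(1, 4).
print(sorted(coset_weights(lambda v: v[0] & v[1], 4).items()))
```

For the rank-2 coset the theorem predicts weights 4, 8 and 12 occurring 4, 24 and 4 times, which the enumeration reproduces.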

An immediate consequence of Theorem 5 is:

Theorem 8. (Weight distribution of second-order Reed-Muller code.) Let $A_i$ be
the number of codewords of weight i in $\mathcal{R}(2, m)$. Then $A_i = 0$ unless $i = 2^{m-1}$ or
$i = 2^{m-1} \pm 2^{m-1-h}$ for some h, $0 \le h \le [\tfrac{1}{2}m]$. Also $A_0 = A_{2^m} = 1$ and

$$A_{2^{m-1} \pm 2^{m-1-h}} = 2^{h(h+1)} \frac{(2^m - 1)(2^{m-1} - 1) \cdots (2^{m-2h+1} - 1)}{(2^{2h} - 1)(2^{2h-2} - 1) \cdots (2^2 - 1)} \qquad \text{for } 1 \le h \le [\tfrac{1}{2}m]. \qquad (12)$$

There is no simple formula for $A_{2^{m-1}}$, but of course

$$A_{2^{m-1}} = 2^{1 + m + \binom{m}{2}} - \sum_{i \ne 2^{m-1}} A_i.$$


Proof. From Theorems 2 and 5. Q.E.D.


Example. The weight distribution of the [32, 16, 8] Reed-Muller code $\mathcal{R}(2, 5)$ is

    i            A_i
    0 or 32      1
    8 or 24      620
    12 or 20     13888
    16           36518

Remarks. (1) Putting h = 1 in (12) gives $A_{2^{m-2}} = \tfrac{4}{3}(2^m - 1)(2^{m-1} - 1)$ for the
number of codewords of minimum weight, in agreement with Theorem 9(b) of
Ch. 13.
(2) Equation (12) is rather complicated, but can be simplified by writing it in
terms of Gaussian binomial coefficients.


Definition. For a real number $b \ne 1$, and all nonnegative integers k, the b-ary
Gaussian binomial coefficients $\begin{bmatrix} x \\ k \end{bmatrix}$ are defined by

$$\begin{bmatrix} x \\ 0 \end{bmatrix} = 1,$$

$$\begin{bmatrix} x \\ k \end{bmatrix} = \frac{(b^x - 1)(b^{x-1} - 1) \cdots (b^{x-k+1} - 1)}{(b^k - 1)(b^{k-1} - 1) \cdots (b - 1)}, \qquad k = 1, 2, \ldots. \qquad (13)$$

(Here x is a real number, usually an integer.) For example

$$\begin{bmatrix} 3 \\ 2 \end{bmatrix} = \frac{(b^3 - 1)(b^2 - 1)}{(b^2 - 1)(b - 1)} = b^2 + b + 1.$$

There are many similarities between Gaussian binomial coefficients and
ordinary binomial coefficients, as the following problem shows (cf. Problem
18 of Ch. 1).


Problem. (3) Properties of Gaussian binomial coefficients. 
. [x x 
e n [i] g (2. 
n] [n 
e) HEARN; 


(c) Define [n] by
[0] = 1,
$[n] = (b^n - 1)(b^{n-1} - 1) \cdots (b - 1)$, for n = 1, 2, ....
Then

$$\begin{bmatrix} n \\ k \end{bmatrix} = \frac{[n]}{[k]\,[n - k]}.$$


: EH dert 


(y + D + by + b): (y+ br => bi po Heck yE 


k=0 


(b =$ nofio totoon, 
(9) Sepre 


In terms of Gaussian binomial coefficients with b =4, Equation (12) 
becomes 


Another useful property of these coefficients is: 


Theorem 9. The number of distinct (although not necessarily inequivalent)
[n, k] codes over GF(q) is the q-ary Gaussian binomial coefficient $\begin{bmatrix} n \\ k \end{bmatrix}$ (with
b = q).


Proof. The number of ways of choosing k linearly independent vectors is

$$(q^n - 1)(q^n - q) \cdots (q^n - q^{k-1}).$$

Each such set is the basis for an [n, k] code $\mathcal{C}$. But the number of bases in $\mathcal{C}$
is

$$(q^k - 1)(q^k - q) \cdots (q^k - q^{k-1}).$$

Therefore the number of distinct $\mathcal{C}$'s is

$$\frac{(q^n - 1)(q^n - q) \cdots (q^n - q^{k-1})}{(q^k - 1)(q^k - q) \cdots (q^k - q^{k-1})} = \begin{bmatrix} n \\ k \end{bmatrix}. \qquad \text{Q.E.D.}$$


This is easy to remember: the number of [n, k] codes is $\begin{bmatrix} n \\ k \end{bmatrix}$.
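
Theorem 9 can be confirmed for small binary parameters by listing subspaces explicitly. The following sketch assumes q = 2 and represents vectors as integers; the names `gaussian_binomial` and `count_binary_codes` are illustrative.

```python
from itertools import combinations

def gaussian_binomial(n, k, b):
    """b-ary Gaussian binomial coefficient [n, k] for integers n >= k >= 0 (Equation (13))."""
    num = den = 1
    for i in range(k):
        num *= b ** (n - i) - 1
        den *= b ** (k - i) - 1
    return num // den

def count_binary_codes(n, k):
    """Count distinct k-dimensional subspaces of GF(2)^n by collecting spans of k-subsets."""
    subspaces = set()
    for basis in combinations(range(1, 2 ** n), k):
        span = {0}
        for vec in basis:
            span |= {x ^ vec for x in span}
        if len(span) == 2 ** k:  # keep only linearly independent choices
            subspaces.add(frozenset(span))
    return len(subspaces)

print(gaussian_binomial(4, 2, 2), count_binary_codes(4, 2))
```

Both numbers equal 35 for n = 4, k = 2, since $(15 \cdot 14)/(3 \cdot 2) = 35$.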





Problems. (4) (Sarwate.) Let W(x, y) be the weight enumerator of $\mathcal{R}(2, m)$,
and let w(z; m) = W(1, z). Show that

$$w(z; m) = w(z^2; m - 1) + z^{2^{m-1}} w(1; m - 1) + 2^{m-1}(2^m - 2) z^{2^{m-2}} w(z^2; m - 2).$$

[Hint: the three terms correspond to the codewords in $\mathcal{R}(1, m - 1)$ of weights
0, $2^{m-1}$ and $2^{m-2}$ respectively.]


(5) Use the methods of this section to prove the following theorem. Let
$ns = 2^{2rl} - 1$, $s > 1$, $l > 1$, and s is a divisor of $2^r + 1$. Let $\mathcal{C}$ be an $[n, 2rl]$
minimal cyclic code. Show that $\mathcal{C}$ consists of 0, n codewords of weight
$(2^{2rl-1} + \epsilon(s - 1)2^{rl-1})/s$, and $n(s - 1)$ codewords of weight $(2^{2rl-1} - \epsilon 2^{rl-1})/s$,
where $\epsilon = (-1)^l$. [Hint: Let $\beta \in GF(2^{2rl})$ be a nonzero of $\mathcal{C}$. If $(a_0 \cdots a_{n-1}) \in \mathcal{C}$,
then $a_i = T_{2rl}(\gamma\beta^i)$ for some $\gamma \in GF(2^{2rl})$ (Theorem 9 of Ch. 8). First take
$s = 2^r + 1$, and $\beta = \xi^{2^r+1}$ where $\xi$ is a primitive element of $GF(2^{2rl})$. Show that

$$Q(\xi) = T_{2rl}(\gamma \xi^{2^r+1})$$

is a quadratic form in $\xi$, and that the rank of the corresponding symplectic
form is either $2rl - 2r$ if $\gamma = \eta^{2^r+1}$ for some $\eta \in GF(2^{2rl})$, or $2rl$ otherwise.
Secondly, if s is a proper divisor of $2^r + 1$, divide the group generated by $\xi^s$
into cosets of the subgroup generated by $\xi^{2^r+1}$.]


*§3. Weight distribution of arbitrary Reed-Muller codes


The first-order RM code $\mathcal{R}(1, m)$ has weight distribution

$$A_0 = A_{2^m} = 1, \qquad A_{2^{m-1}} = 2^{m+1} - 2; \qquad (15)$$

and the weight distribution of the second-order code is given in Theorem 8. Now
$\mathcal{R}(m - r - 1, m)$ is dual to $\mathcal{R}(r, m)$, so from Theorem 1 of Ch. 5 the weight
distributions of $\mathcal{R}(m - 2, m)$ (the extended Hamming code) and $\mathcal{R}(m - 3, m)$
are also known. But so far there is no formula for any other general class of
Reed-Muller codes.

Research Problem (15.1). Find the weight distribution of $\mathcal{R}(3, m)$, ....


Nevertheless there are some general results about the weight enumerator
of $\mathcal{R}(r, m)$, which we now state without proof.

Theorem 8 shows that in $\mathcal{R}(2, m)$ with minimum distance $d = 2^{m-2}$, the only
weights occurring in the range d to 2d are of the form

$$w = 2d - 2^i \quad \text{for some } i.$$

This property holds for all RM codes, and in fact Kasami and Tokura have
found the number of codewords of weight w in $\mathcal{R}(r, m)$ for all $w < 2d = 2^{m-r+1}$
(actually for all $w < 2\tfrac{1}{2}d$; see below). To do this they found a canonical form
for all the relevant Boolean functions, as follows:


Theorem 10. Let $f(v_1, \ldots, v_m)$ be a Boolean function of degree at most r, where
$r \ge 2$, such that

$$\text{wt}(f) < 2^{m-r+1}.$$

Then f can be transformed by an affine transformation into either

(i) $f = v_1 \cdots v_{r-\mu}(v_{r-\mu+1} \cdots v_r + v_{r+1} \cdots v_{r+\mu})$,

where $\mu$ satisfies $3 \le \mu \le r$ and $\mu \le m - r$, or

(ii) $f = v_1 \cdots v_{r-2}(v_{r-1}v_r + v_{r+1}v_{r+2} + \cdots + v_{r+2\mu-3}v_{r+2\mu-2})$,

where $\mu$ satisfies $2 \le 2\mu \le m - r + 2$.

From this result and Theorem 8 Kasami and Tokura [745] proved:


Theorem 11. (Weight distribution of $\mathcal{R}(r, m)$ in the range d to 2d.) Let $A_w$ be
the number of codewords of weight w in $\mathcal{R}(r, m)$, where $r \ge 2$, and suppose

$$d = 2^{m-r} \le w < 2d.$$

Define $\alpha = \min(m - r, r)$ and $\beta = \tfrac{1}{2}(m - r + 2)$. Then

(i) $A_w = 0$ unless $w = w(\mu) = 2^{m-r+1} - 2^{m-r+1-\mu}$ for some $\mu$ in the range
$1 \le \mu \le \max(\alpha, \beta)$. The case $\mu = 1$ corresponds to w = d and is taken care of
by Theorem 9(b) of Ch. 13.

(ii) If $\mu = 2$ or $\max(\alpha, 2) < \mu \le \beta$ then

$$A_{w(\mu)} = \frac{2^{r + \mu^2 + \mu - 2} \prod_{i=0}^{r+2\mu-3} (2^{m-i} - 1)}{\prod_{i=0}^{r-3} (2^{r-2-i} - 1) \prod_{i=0}^{\mu-1} (4^{i+1} - 1)}. \qquad (16)$$

(iii) If $\max(\beta, 2) < \mu \le \alpha$ then

$$A_{w(\mu)} = \frac{2^{r + \mu^2 + \mu - 1} \prod_{i=0}^{r+\mu-1} (2^{m-i} - 1)}{\prod_{i=0}^{r-\mu-1} (2^{r-\mu-i} - 1) \prod_{i=0}^{\mu-1} (2^{\mu-i} - 1)^2}. \qquad (17)$$

(iv) If $3 \le \mu \le \min(\alpha, \beta)$ then $A_{w(\mu)}$ is equal to the sum of (16) and (17).


Theorem 11 has been extended to weights $< 2\tfrac{1}{2}d$ by Kasami, Tokura and
Azumi [746], but the algebra becomes very complicated and it would seem
that a different approach is needed to go further.

The weight enumerators of all Reed-Muller codes of lengths $\le 256$ are now
known; see Kasami et al. [746], Sarwate [1144], Sugino et al. [1284], and Van
Tilborg [1325]. The smallest Reed-Muller codes for which the weight
distributions are not presently known (in 1977) are $\mathcal{R}(3, 9)$, $\mathcal{R}(4, 9)$ and $\mathcal{R}(5, 9)$.

The second general result is McEliece's theorem that the weight of every
codeword in $\mathcal{R}(r, m)$ is divisible by $2^{\lceil m/r \rceil - 1}$. This follows from:


Theorem 12. (McEliece [939, 941].) If $\mathcal{C}$ is a binary cyclic code then the weight
of every codeword in $\mathcal{C}$ is divisible by $2^{l-1}$, where l is the smallest number such
that l nonzeros of the code (with repetitions allowed) have product 1.


The proof of this theorem is difficult and is omitted.


Corollary 13. All weights in $\mathcal{R}(r, m)$ are multiples of

$$2^{\lceil m/r \rceil - 1} = 2^{[(m-1)/r]}.$$


Proof. From Equation (10) of Ch. 13, $\alpha^s$ for $1 \le s \le 2^m - 2$ is a nonzero of the
punctured cyclic code $\mathcal{R}(r, m)^*$ iff the binary expansion of s has from 1 to r
ones in it. Now the product

$$\alpha^{s_1} \alpha^{s_2} \cdots \alpha^{s_l} = 1$$

iff

$$s_1 + s_2 + \cdots + s_l = 2^m - 1. \qquad (18)$$

Looking at Equation (18) in binary, we see that each term on the left has at
most r ones, whereas the RHS contains m ones. From this it follows that the
smallest l for which (18) has a solution is $\lceil m/r \rceil$. The result then follows from
Theorem 12. Q.E.D.


Example. The weights in $\mathcal{R}(2, 5)$ are divisible by $2^2$, as we saw in the example
following Theorem 8.
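
Corollary 13 can be observed directly on small Reed-Muller codes. The following sketch generates $\mathcal{R}(r, m)$ from the monomials of degree at most r (truth tables stored as integer bitmasks), collects the set of weights, and tests divisibility by $2^{\lceil m/r \rceil - 1}$; the function name `rm_weights` is illustrative.

```python
from itertools import combinations

def rm_weights(r, m):
    """Set of weights occurring in R(r, m), by enumerating all codewords."""
    n = 2 ** m
    # Truth table of the variable v_i: bit x is set when bit i of the point x is 1.
    var = [sum(1 << x for x in range(n) if (x >> i) & 1) for i in range(m)]
    generators = []
    for degree in range(r + 1):
        for subset in combinations(range(m), degree):
            table = (1 << n) - 1  # the constant function 1
            for i in subset:
                table &= var[i]
            generators.append(table)
    words = [0]
    for g in generators:  # double the list once per generator
        words += [w ^ g for w in words]
    return {bin(w).count("1") for w in words}

for r, m in [(1, 4), (2, 4), (2, 5)]:
    divisor = 2 ** (-(-m // r) - 1)  # 2^(ceil(m/r) - 1)
    weights = rm_weights(r, m)
    print(r, m, sorted(weights), all(w % divisor == 0 for w in weights))
```

The enumeration grows as $2^k$ in the code dimension k, so it is intended only for short codes such as these.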
*§4. Subcodes of dimension 2m of $\mathcal{R}(2, m)^*$ and $\mathcal{R}(2, m)$


We begin by proving a simple theorem which makes it easy to find the
weight distribution of many small subcodes of $\mathcal{R}(2, m)^*$ and $\mathcal{R}(2, m)$, by
finding the rank of the corresponding symplectic forms.


Theorem 14. Let B be a symplectic matrix of rank 2h. Then the set of binary
m-tuples v such that $uBv^T = 0$ for all $u \in V^m$ is a space of dimension m - 2h.


Proof. Let R be the matrix described in Theorem 4 and set $u' = uR^{-1} =
(u'_1, \ldots, u'_m)$, $v' = vR^{-1} = (v'_1, \ldots, v'_m)$. Then $uBv^T = u'RBR^Tv'^T$.
Now $RBR^T$ is of the form

$$\begin{pmatrix} 0 & 1 & \cdots & 0 & 0 & \\ 1 & 0 & \cdots & 0 & 0 & \\ & & \ddots & & & 0 \\ 0 & 0 & \cdots & 0 & 1 & \\ 0 & 0 & \cdots & 1 & 0 & \\ & & 0 & & & 0 \end{pmatrix},$$

where the nonzero block in the top left corner has size 2h × 2h. Thus

$$u'RBR^Tv'^T = u'_1v'_2 + u'_2v'_1 + u'_3v'_4 + u'_4v'_3 + \cdots + u'_{2h-1}v'_{2h} + u'_{2h}v'_{2h-1}.$$

This is identically zero for all u' iff

$$v' = (0 \cdots 0\, v'_{2h+1} \cdots v'_m).$$

Such v' form a space of dimension m - 2h. Q.E.D.
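
Theorem 14 says that the vectors v annihilated by B form a space of dimension m - 2h. A minimal check, with an illustrative helper `radical`, lists these vectors for the rank-4 matrix of Equation (9) and for a hypothetical 5 × 5 symplectic matrix of rank 2 chosen only as a second test case.

```python
from itertools import product

def radical(B):
    """All v with u B v^T = 0 for every u, i.e. all v with B v^T = 0 over GF(2)."""
    m = len(B)
    return [v for v in product((0, 1), repeat=m)
            if all(sum(B[i][j] * v[j] for j in range(m)) % 2 == 0 for i in range(m))]

# Matrix B of Equation (9): m = 4, rank 4, so the space has dimension 0.
B4 = [[0, 1, 0, 0],
      [1, 0, 0, 0],
      [0, 0, 0, 1],
      [0, 0, 1, 0]]
# A 5 x 5 symplectic matrix of rank 2 (third row is the sum of the first two).
B5 = [[0, 1, 1, 0, 0],
      [1, 0, 1, 0, 0],
      [1, 1, 0, 0, 0],
      [0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0]]
print(len(radical(B4)), len(radical(B5)))
```

The two counts are $2^{4-4} = 1$ and $2^{5-2} = 8$, as the theorem predicts.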


Now let S(v) (Equation (1)) be the Boolean function describing a codeword
c of $\mathcal{R}(2, m)$. The components of c are obtained from S(v) by letting
$(v_1 \cdots v_m)$ run through all binary m-tuples. We can also consider these
components to be the values of $S(\xi)$, for $\xi \in GF(2^m)$. Define $Q(\xi)$ for
$\xi \in GF(2^m)$ by

$$Q(\xi) = \boldsymbol{\xi} Q \boldsymbol{\xi}^T,$$

where $\boldsymbol{\xi}$ is the m-tuple corresponding to $\xi \in GF(2^m)$. The values of the
corresponding symplectic form are, by Equation (4),

$$\mathcal{B}(\xi, \eta) = Q(\xi + \eta) + Q(\xi) + Q(\eta). \qquad (19)$$

We shall find that $\mathcal{B}(\xi, \eta)$ can usually be written as

$$\mathcal{B}(\xi, \eta) = T_m(\xi L_B(\eta)),$$

where $L_B(x)$ is a linearized polynomial. Then from Theorem 14

$$\text{rank } B = m - \text{dimension (kernel } L_B), \qquad (20)$$

where kernel $L_B$ is the subspace of $\eta$ in $GF(2^m)$ for which $L_B(\eta) = 0$.


Small subcodes of $\mathcal{R}(2, m)^*$. In this section we shall find the weight
distribution of the $[2^m - 1, 2m]$ cyclic code with idempotent $\theta_1^* + \theta_{l_i}^*$, where
$l_i = 1 + 2^i$, for all i. This is a subcode of $\mathcal{R}(2, m)^*$. In particular the code with
idempotent $\theta_1^* + \theta_3^*$ is the dual of the double-error-correcting BCH code.

Our method for finding the weight distribution is as follows. First the
Mattson-Solomon polynomial of a typical codeword is obtained, and from
this the Boolean function $S(\xi)$ describing the codeword. This is a quadratic
since the code is a subcode of $\mathcal{R}(2, m)^*$. Then we find the corresponding
symplectic form $\mathcal{B}(\xi, \eta)$, the rank of this form from Equation (20), and use
Theorem 5 to find the possible weights in the code. Finally the methods of
Chapter 6 are used to obtain the weight distribution.

From Equation (8) of Ch. 13 the nonzeros of $\mathcal{R}(2, m)^*$ can be taken to be
$\alpha^s$ with $w_2(s) > m - 3$; i.e. those powers of $\alpha$ with exponents belonging to the
cyclotomic cosets

$$C_0, \; C_{-1} \; \text{and} \; C_{-l_i}, \quad \text{for all } l_i = 1 + 2^i, \quad 1 \le i \le [m/2].$$

The idempotent of $\mathcal{R}(2, m)^*$ is

$$\theta_0 + \theta_1^* + \sum_{i=1}^{[m/2]} \theta_{l_i}^*.$$


Case (I). m odd, m = 2t + 1. This case is easy because of the following result,
whose proof we leave to the reader.


Problem (6) (a) Show that if m is odd, and $1 \le i \le t$, then $2^m - 1$ and $2^i + 1$ are
relatively prime. (b) Show that $2^i + 1$ and $2^{m-i} + 1$ are the only odd numbers in
the cyclotomic coset $C_{l_i}$, and that this coset has m elements.


We shall find the weight distribution of the $[2^m - 1, 2m]$ subcode $\mathcal{C}_i$ of
$\mathcal{R}(2, m)^*$ with idempotent $\theta_1^* + \theta_{l_i}^*$. A nonzero codeword of $\mathcal{C}_i$ has one of the
forms $x^j\theta_1^*$, $x^k\theta_{l_i}^*$, or $x^j\theta_1^* + x^k\theta_{l_i}^*$, for some integers j and k. The first two
codewords have weight $2^{m-1}$, and give us no trouble.

The third type of codeword is a cyclic shift of the codeword a= 
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x'*0* + 0*. The MS polynomial of a is 
A(z) = * +) z 
tec) rec, 


= Tn (yz) + T,(z?), 


see $6 of Ch. 8. Let S(Z) be the Boolean function describing a. Then from 
Theorem 20 of Ch. 8, 


S(£) = T.(y£) + T,(£"), for all € € GFQ")*. (21) 


The constant term e (see Equation (1)) is e = S(0)=0. Let a' be the vector 
obtained from a by adding an overall parity check (which is zero). 

We use Equation (21) to find the symplectic form corresponding to that 
coset of R(1, m) in RE, m) which contains a’. This is, from (5) and (21), 


BE, m) = Tn y(E + m) + T£ m7) 
+ T.(yD + T,(E77) + T. Cyn) + Tm (7). 
Since 
(E+ 0) = EU + Ent En" +, 
BE, n) becomes 
T(E" + £n") = T.(£q"  * 97), since T(x) = T, (x). 


Next we use Equation (20) to find the rank of B. B(E, n) is zero for all 
values of £ 
iff vn tn” =0 
iff (n +n) =0 
iff n= n” 
iff n E GF2"75n GF") = GF(2’), 


where s = (m — 2i, m) = (i, m) (by Theorem 8 of Ch. 4). Thus the space of 7 
for which (£, 7) = 0 for all has dimension s. From Equation (20), the rank 
of B is m — s. We conclude from Theorem 5 that the codewords of the form 
x'0f + x*0* have weights 2" ' and 2"! + 20"*5-??, 

Finally the dual code €; is a subcode of the Hamming code, so has d' > 3. 
We can therefore use Theorem 2 of Ch. 6 to find the weight distribution of €, 
which is given in Fig. 15.2. Figure 15.2 extends the result of Th. 34 of Ch. 8. 


—2i 








i Ai 

0 l 
2m-t = Dimts—2y/2 Q" -— Do" + DOTEA) 
205 Q"-nDQ"-2"*-«1) 


2m-4 + Dim ts— 2972 Q" "- DQ7-- EN DnS) 


Fig. 15.2. Weight distribution of the code with idempotent 0f + 6#, where l, 2 1+2', s =(i, m), m 
odd 
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Note that €,, with idempotent 8* + 6%, is a [2" — 1, 2m, 2"! —2"7"?] code 
which is the dual of the double-error-correcting BCH code. In this case i = 1, 
s = 1 and the weight distribution is shown in Fig. 15.3. (This was used in $8 of 
Ch. 9 to show that the double-error-correcting BCH code is quasi-perfect.) 


i A; 

0 1 
VEL UE Q" — D"? + 2m»? 
27^ Q"-DQ"'-«10 


om + Zim v2 Q" ZEN bo"? = 2m-n 


Fig. 15.3. Weight distribution of dual of double-error-correcting BCH code of length 2” — 1, m 
odd 


Examples. For $n = 31$, $\mathcal{C}_1$ and $\mathcal{C}_2$ are [31, 10, 12] codes; for $n = 127$, $\mathcal{C}_1$, $\mathcal{C}_2$ and
$\mathcal{C}_3$ are [127, 14, 56] codes; and for $n = 511$, $\mathcal{C}_1$, $\mathcal{C}_2$ and $\mathcal{C}_4$ are [511, 18, 240]
codes. The weight distributions of these codes are the same as those given in
Fig. 8.7. For $n = 511$, $\mathcal{C}_3$ is a [511, 18, 224] code with the weight distribution


Weight    Number of Words

0         1
224       36 · 511
256       449 · 511
288       28 · 511
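As a cross-check of Fig. 15.2 against the examples just given, the table can be evaluated directly. This is a small illustrative sketch; the function name `fig_15_2` is hypothetical, and the formulas are the ones read from the figure.

```python
from math import gcd


def fig_15_2(m, i):
    """Weight distribution of Fig. 15.2 as {weight: number of codewords}."""
    s = gcd(i, m)
    if (m - s) % 2:
        raise ValueError("the table needs m - s even (for example m odd)")
    n = 2 ** m - 1
    step = 2 ** ((m + s - 2) // 2)      # offset of the two outer weights
    half = 2 ** ((m - s - 2) // 2)
    return {
        0: 1,
        2 ** (m - 1) - step: n * (2 ** (m - s - 1) + half),
        2 ** (m - 1): n * (2 ** m - 2 ** (m - s) + 1),
        2 ** (m - 1) + step: n * (2 ** (m - s - 1) - half),
    }


if __name__ == "__main__":
    for m, i in ((5, 1), (7, 3), (9, 1), (9, 3)):
        dist = fig_15_2(m, i)
        print(m, i, dist, "total =", sum(dist.values()))
```

For $m = 9$, $i = 3$ this gives the counts $36 \cdot 511$, $449 \cdot 511$ and $28 \cdot 511$ of the table above, and in every case the counts add up to $2^{2m}$.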


Case (II). m even. This case is more difficult because $l_i = 1 + 2^i$ is frequently
not prime to $2^m - 1$.

We begin by doing a special case, the $[2^m - 1, 2m, 2^{m-1} - 2^{m/2}]$ code $\mathcal{C}_1$ with
idempotent $\theta_1^* + \theta_3^*$, which is the dual of the double-error-correcting BCH
code.

Now 3 divides $2^m - 1$ since $m$ is even, so the general form of a codeword of
$\mathcal{C}_1$ is $a_0 x^{i_1}\theta_1^* + b(x)\theta_3^*$. This has MS polynomial

$$\sum_{s \in C_1} (\beta_1 z)^s + \sum_{s \in C_3} (\beta_3 z)^s$$

(§6 of Ch. 8). The corresponding Boolean function has values
$$f(\xi) = T_m(\beta_1\xi + (\beta_3\xi)^3), \quad \text{for } \xi \in GF(2^m)^*,$$
from Theorem 20 of Ch. 8. The symplectic form is

$$\mathcal{B}(\xi, \eta) = T_m(\beta(\xi^2\eta + \xi\eta^2)) = T_m(\xi(\beta\eta^2 + \beta^{2^{m-1}}\eta^{2^{m-1}})),$$
where $\beta = \beta_3^3$.
So we must find the dimension of the space of zeros of the expression

$$\beta\eta^2 + \beta^{2^{m-1}}\eta^{2^{m-1}} = (\gamma^2\eta + \gamma\eta^{2^{m-2}})^2, \qquad (22)$$
where $\gamma^4 = \beta$. Now $\eta$ is a zero of (22) iff $\eta$ is zero or
$$\eta^{2^{m-2}-1} = \gamma. \qquad (23)$$

How many different $\gamma$'s are there in $GF(2^m)^*$ which satisfy this equation? If $\omega$
is a primitive element of $GF(2^2) \subset GF(2^m)$, then

$$\eta^{2^{m-2}-1} = (\omega\eta)^{2^{m-2}-1},$$

for all $\eta \in GF(2^m)$. Conversely, if

$$\eta_1^{2^{m-2}-1} = \eta_2^{2^{m-2}-1},$$

then $\eta_1/\eta_2 \in GF(2^{m-2}) \cap GF(2^m) = GF(2^2)$. Therefore there are $\tfrac13(2^m - 1)$ $\gamma$'s in
$GF(2^m)^*$ which satisfy (23).

If $\gamma$ has the form $\eta^{2^{m-2}-1}$ for some $\eta \in GF(2^m)^*$, then Equation (22) has four
zeros in $GF(2^m)$, and the rank of $\mathcal{B}$ is $m - 2$. Therefore the weight of the
corresponding codeword is $2^{m-1}$ or $2^{m-1} \pm 2^{m/2}$. If $\gamma$ is not of this form, (22) has
only the zero $\eta = 0$, and the codeword has weight $2^{m-1} \pm 2^{m/2-1}$.

The minimum distance of the dual code is 5, and again the weight
distribution of $\mathcal{C}_1$ may be obtained from Theorem 2 of Ch. 6, and is shown in
Fig. 15.4.


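$i$                               $A_i$

$0$                               $1$
$2^{m-1} - 2^{m/2}$               $\tfrac13\, 2^{(m-2)/2-1}(2^{(m-2)/2} + 1)(2^m - 1)$
$2^{m-1} - 2^{m/2-1}$             $\tfrac13\, 2^{(m+2)/2-1}(2^{m/2} + 1)(2^m - 1)$
$2^{m-1}$                         $(2^{m-2} + 1)(2^m - 1)$
$2^{m-1} + 2^{m/2-1}$             $\tfrac13\, 2^{(m+2)/2-1}(2^{m/2} - 1)(2^m - 1)$
$2^{m-1} + 2^{m/2}$               $\tfrac13\, 2^{(m-2)/2-1}(2^{(m-2)/2} - 1)(2^m - 1)$


Fig. 15.4. Weight distribution of the dual of the double-error-correcting BCH code for even m.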
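For small $m$ the distributions of Figs. 15.3 and 15.4 can be confirmed by exhaustive enumeration. The sketch below is illustrative only: it assumes the code is described by the words $(T_m(a\xi + b\xi^{1+2^i}))_{\xi \ne 0}$ with $a, b \in GF(2^m)$, in line with the Boolean functions used in the text, and it builds the field from irreducible polynomials chosen here ($x^4+x+1$, $x^5+x^2+1$, $x^6+x+1$). All function names are hypothetical.

```python
from collections import Counter

# Assumed irreducible polynomials over GF(2); bit k is the coefficient of x^k.
IRREDUCIBLE = {4: 0b10011, 5: 0b100101, 6: 0b1000011}


def gf_mul(a, b, m):
    """Multiply two elements of GF(2^m), each stored as an integer below 2^m."""
    poly = IRREDUCIBLE[m]
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return result


def trace(x, m):
    """T_m(x) = x + x^2 + ... + x^(2^(m-1)); the result is the integer 0 or 1."""
    total, power = 0, x
    for _ in range(m):
        total ^= power
        power = gf_mul(power, power, m)
    return total


def weight_distribution(m, i=1):
    """Weights of the words (T_m(a*xi + b*xi^(1+2^i))) over all nonzero xi."""
    q = 1 << m
    tr = [trace(x, m) for x in range(q)]
    points = range(1, q)
    quad = {}
    for xi in points:
        y = xi
        for _ in range(i):
            y = gf_mul(y, y, m)            # y = xi^(2^i)
        quad[xi] = gf_mul(xi, y, m)        # xi^(1 + 2^i)
    dist = Counter()
    for a in range(q):
        for b in range(q):
            weight = sum(
                tr[gf_mul(a, xi, m) ^ gf_mul(b, quad[xi], m)] for xi in points
            )
            dist[weight] += 1
    return dict(sorted(dist.items()))


if __name__ == "__main__":
    print(weight_distribution(4))   # even m, compare with Fig. 15.4
    print(weight_distribution(5))   # odd m, compare with Fig. 15.3
```

The case $m = 6$ is also within reach of this brute-force approach, at the cost of a few seconds of running time.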


Now we do the general case, the code $\mathcal{C}_i$ with idempotent $\theta_1^* + \theta_{l_i}^*$,
$l_i = 1 + 2^i$, where $1 < i < \tfrac12 m$. (The case $i = \tfrac12 m$ was treated in Ch. 8.)

The codewords of $\mathcal{C}_i$ are of one of the forms $x^{i_1}\theta_1^*$, $a(x)\theta_{l_i}^*$, or $x^{i_1}\theta_1^* + a(x)\theta_{l_i}^*$.
The first of these always has weight $2^{m-1}$; while for the others we obtain a
symplectic form

$$\mathcal{B}(\xi, \eta) = T_m(\xi(\beta\eta^{2^i} + \beta^{2^{m-i}}\eta^{2^{m-i}})).$$
Hence we need to find the number of $\eta$ such that
$$0 = \beta\eta^{2^i} + \beta^{2^{m-i}}\eta^{2^{m-i}} = (\gamma^{2^i}\eta + \gamma\eta^{2^{m-2i}})^{2^i}, \qquad (24)$$
where $\gamma^{2^{2i}} = \beta$, or equivalently the number of nonzero $\eta$ such that
$$\eta^{2^{m-2i}-1} = \gamma^{2^i-1}. \qquad (25)$$

The number of solutions to (25) is given by Problem 7 below. From this, if
$(m, i) = (m, 2i) = s$, (24) has $2^s$ solutions in $GF(2^m)$ for any choice of $\gamma$.
Therefore $\mathcal{C}_i$ contains just 3 weights, $2^{m-1}$ and $2^{m-1} \pm 2^{(m+s-2)/2}$,
and the weight distribution is the same as if $m$ were odd and is shown in Fig. 15.2.

On the other hand, if $(m, 2i) = 2(m, i) = 2s$, Equation (24) has either 1 or $2^{2s}$
solutions depending on the choice of $\gamma$, and therefore $\mathcal{C}_i$ contains 5 weights,
namely $2^{m-1}$, $2^{m-1} \pm 2^{(m+2s-2)/2}$ and $2^{m-1} \pm 2^{m/2-1}$. By Problem 8 $\mathcal{C}_i^{\perp}$ has
minimum distance at least 5, and again we can use Theorem 2 of Ch. 6 to obtain
the weight distribution, which is shown in Fig. 15.5.


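$i$                                   $A_i$

$0$                                   $1$
$2^{m-1}$                             $(2^m - 1)\{(2^s - 1)2^{m-2s} + 1\}$
$2^{m-1} \pm 2^{(m+2s-2)/2}$          $2^{(m-2s-2)/2}(2^m - 1)(2^{(m-2s)/2} \mp 1)/(2^s + 1)$
$2^{m-1} \pm 2^{m/2-1}$               $2^{(m+2s-2)/2}(2^m - 1)(2^{m/2} \mp 1)/(2^s + 1)$


Fig. 15.5. Weight distribution of code with idempotent $\theta_1^* + \theta_{l_i}^*$, where $l_i = 1 + 2^i$, $1 < i < \tfrac12 m$,
$(m, 2i) = 2(m, i) = 2s$.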


Problems. (7) Let $m$ be even and $g$ arbitrary. Show that the number of
integers $x$ in the range $0 \le x \le 2^m - 2$ which satisfy

$$(2^{m-2i} - 1)x \equiv (2^i - 1)g \mod 2^m - 1$$

is
(i) $2^s - 1$ if $(m, 2i) = (m, i) = s$,
(ii) $2^{2s} - 1$ if $(m, 2i) = 2(m, i) = 2s$ and $(2^s + 1) \mid g$,
(iii) $0$ if $(m, 2i) = 2(m, i) = 2s$ and $(2^s + 1) \nmid g$.


(8) Use the Hartmann-Tzeng generalized BCH bound, Problem 24 of Ch.
7, to show that the minimum distance of $\mathcal{C}_i^{\perp}$ is at least 5.

(9) Show that $2^m - 1$ and $2^i + 1$ are relatively prime iff $(m, i) = (m, 2i)$. If
$(m, 2i) = 2(m, i)$ then $2^i + 1$ and $2^m - 1$ have a nontrivial common factor.

(10) Show that the dual of the $[2^m, 2^m - 1 - 2m, 6]$ extended d.e.c. BCH code
has parameters

$$[2^m, 2m + 1, 2^{m-1} - 2^{[m/2]}].$$
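The solution counts claimed in Problem 7 can be tested by brute force for small even $m$. This is a sketch for experimentation only, not a proof; `count_solutions` and `predicted_count` are hypothetical helper names.

```python
from math import gcd


def count_solutions(m, i, g):
    """Number of x in 0..2^m-2 with (2^(m-2i) - 1) x = (2^i - 1) g mod 2^m - 1."""
    n = 2 ** m - 1
    a = 2 ** (m - 2 * i) - 1
    c = ((2 ** i - 1) * g) % n
    return sum(1 for x in range(n) if (a * x) % n == c)


def predicted_count(m, i, g):
    """The count stated in Problem 7, cases (i) to (iii)."""
    s = gcd(m, i)
    if gcd(m, 2 * i) == s:                  # case (i)
        return 2 ** s - 1
    if g % (2 ** s + 1) == 0:               # case (ii)
        return 2 ** (2 * s) - 1
    return 0                                # case (iii)


if __name__ == "__main__":
    for m in (4, 6, 8):
        for i in range(1, (m + 1) // 2):
            ok = all(
                count_solutions(m, i, g) == predicted_count(m, i, g)
                for g in range(2 ** m - 1)
            )
            print(m, i, "agrees" if ok else "differs")
```

Adding 1 for the solution $\eta = 0$ recovers the $2^s$, $2^{2s}$ or $1$ zeros of Equation (24).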


*§5. The Kerdock code and generalizations


In this rather long section we are going to study a special kind of subcode
of the second-order RM code $\mathcal{R}(2, m)$. Let $\mathcal{S}$ be the set of symplectic forms
$\mathcal{B}(\xi, \eta)$ associated with the codewords of this subcode (see Equations (4) and
(19)). Then $\mathcal{S}$ has the property that the rank of every nonzero form in $\mathcal{S}$ is at
least $2d$, and the rank of the sum of any two distinct forms in $\mathcal{S}$ is also at
least $2d$. Here $d$ is some fixed number in the range $1 \le d \le [m/2]$. In Ch. 21 we
shall see that the maximum size of such a set $\mathcal{S}$ is

$$2^{(2t+1)(t-d+1)} \quad \text{if } m = 2t + 1, \qquad (26)$$
$$2^{(2t+1)(t-d+2)} \quad \text{if } m = 2t + 2. \qquad (27)$$

Hence the maximum size of the corresponding subcode of $\mathcal{R}(2, m)$ is $2^{1+m}|\mathcal{S}|$.
In Ch. 21 we shall also find the number of symplectic forms of each rank in a
maximal size set, and hence the distance distribution of the corresponding
code.

For odd $m$ these maximum codes turn out to be linear, but for even $m$ they
are nonlinear codes which include the Nordstrom-Robinson and Kerdock
codes as a special case. The material in this section is due to Delsarte and
Goethals.

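Case (I). m odd, $m = 2t + 1$. The general codeword of $\mathcal{R}(2, m)^*$ is
$$b\theta_0 + a_0 x^{i_0}\theta_1^* + \sum_{j=1}^{t} a_j x^{i_j}\theta_{l_j}^*, \quad l_j = 1 + 2^j.$$
This has the MS polynomial
$$\sum_{s \in C_1} (\gamma_0 z)^s + \sum_{j=1}^{t} \sum_{s \in C_{l_j}} (\gamma_j z)^s,$$

so the corresponding Boolean function and symplectic form are

$$f(\xi) = \sum_{j=0}^{t} T_{2t+1}((\gamma_j \xi)^{1+2^j})$$

and
$$\mathcal{B}(\xi, \eta) = \sum_{j=1}^{t} T_{2t+1}(\gamma_j^{1+2^j}(\xi\eta^{2^j} + \xi^{2^j}\eta)) = T_{2t+1}(\xi L_{\mathcal{B}}(\eta)), \qquad (28)$$
where
$$L_{\mathcal{B}}(\eta) = \sum_{j=1}^{t} \left(\gamma_j^{1+2^j}\eta^{2^j} + (\gamma_j^{1+2^j}\eta)^{2^{2t+1-j}}\right). \qquad (29)$$

By setting either the first $d - 1$ or the last $d - 1$ $\gamma_j$'s equal to zero in (28) we
get a symplectic form of rank $\ge 2d$.

Clearly the sum of two such symplectic forms also has $\gamma_j = 0$ for the same
values of $j$. (This is why the corresponding code is linear.)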


Theorem 15. If $\gamma_1 = \gamma_2 = \cdots = \gamma_{d-1} = 0$ then the symplectic form (28) has rank
$\ge 2d$.

Proof. If $\gamma_1 = \cdots = \gamma_{d-1} = 0$ then

$$L_{\mathcal{B}}(\eta) = \sum_{j=d}^{t} \left(\gamma_j^{1+2^j}\eta^{2^j} + (\gamma_j^{1+2^j}\eta)^{2^{2t+1-j}}\right) = L'_{\mathcal{B}}(\eta)^{2^d},$$

where degree $L'_{\mathcal{B}}(\eta) \le 2^{2(t-d)+1}$. Thus the dimension of the space of $\eta$ for which
$L_{\mathcal{B}}(\eta) = 0$ is at most $2(t - d) + 1$, and so rank $\mathcal{B} \ge 2t + 1 - 2(t - d) - 1 = 2d$.
Q.E.D.


Theorem 16. If $\gamma_{t-d+2} = \gamma_{t-d+3} = \cdots = \gamma_t = 0$ then again the symplectic form (28)
has rank $\ge 2d$.


Proof. The exponents of $\eta$ which occur in $L_{\mathcal{B}}(\eta)$ are now
$2, 2^2, \ldots, 2^{t-d+1}, 2^{t+d}, 2^{t+d+1}, \ldots, 2^{2t}$. Set $\beta = \eta^{2^{t+d}}$. Then the exponents of
$\beta$ in $L_{\mathcal{B}}(\eta)$ are $1, 2, \ldots, 2^{t-d}, 2^{t-d+2}, \ldots, 2^{2t-2d+2}$. The highest power of $\beta$ is at
most $2^{2(t-d)+2}$, so the dimension of the space of $\beta$ for which $L_{\mathcal{B}}(\eta) = 0$ is at
most $2(t - d) + 2$. Since $m$ is odd, this dimension must be odd (Equation (20)),
and is at most $2(t - d) + 1$. Hence the rank of $\mathcal{B}$ is at least $2t + 1 - 2(t - d) -
1 = 2d$. Q.E.D.


By setting $\gamma_j = 0$ we are removing the idempotent $\theta_{l_j}^*$ from the code. So as a
corollary to Theorems 15 and 16 we have:


Corollary 17. Let $m = 2t + 1$ be odd, and let $d$ be any number in the range
$1 \le d \le t$. Then there exist two

$$[2^m, m(t - d + 2) + 1, 2^{m-1} - 2^{m-d-1}]$$

subcodes of $\mathcal{R}(2, m)$. These are obtained by extending the cyclic subcodes of
$\mathcal{R}(2, m)^*$ having idempotents

$$\theta_0 + \theta_1^* + \sum_{j=d}^{t} \theta_{l_j}^*, \qquad (30)$$
and
$$\theta_0 + \theta_1^* + \sum_{j=1}^{t-d+1} \theta_{l_j}^*. \qquad (31)$$

These codes have weights $2^{m-1}$ and $2^{m-1} \pm 2^{m-h-1}$ for all $h$ in the range
$d \le h \le t$.


Proof. We have seen that the nonzero symplectic forms associated with the
codewords have rank $\ge 2d$. The result then follows from Theorem 5. Q.E.D.

Remarks. From Equation (26) these codes have the maximum possible size.
The weight distribution can be obtained from the results given in §8 of Ch. 21.


Case (II). m even, $m = 2t + 2$. Again we are looking for maximal sets of
symplectic forms such that every form has rank $\ge 2d$ and the sum of every
two distinct forms also has rank $\ge 2d$. But now for even $m$ the resulting codes
turn out to be nonlinear. We begin with the case $d = \tfrac12 m$, in which case we get
the Kerdock codes. Before giving the definition, we summarize in Fig. 15.6
the properties of this family of codes. (For $m = 4$, see Fig. 2.19.)












For even $m \ge 4$, $\mathcal{K}(m)$ is a nonlinear code with

length $n = 2^m$,
contains $2^{2m}$ codewords,
minimum distance $2^{m-1} - 2^{(m-2)/2}$.

The first few codes are $(16, 256, 6)$, $(64, 2^{12}, 28)$, $(256, 2^{16}, 120)$ codes,
and the first of these is equivalent to the Nordstrom-Robinson code
$\mathcal{N}_{16}$. The general form for a codeword of $\mathcal{K}(m)$ is Equation (34). $\mathcal{K}(m)$
is systematic. The weight and distance distributions coincide and are
given in Fig. 15.7. Also $\mathcal{R}(1, m) \subset \mathcal{K}(m) \subset \mathcal{R}(2, m)$. The codewords
of each weight in $\mathcal{K}(m)$ form a 3-design.


Fig. 15.6. Summary of properties of Kerdock code $\mathcal{K}(m)$.


$i$                               $A_i$

$0$                               $1$
$2^{m-1} - 2^{(m-2)/2}$           $2^m(2^{m-1} - 1)$
$2^{m-1}$                         $2^{m+1} - 2$
$2^{m-1} + 2^{(m-2)/2}$           $2^m(2^{m-1} - 1)$
$2^m$                             $1$


Fig. 15.7. Weight (or distance) distribution of Kerdock code $\mathcal{K}(m)$.
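The entries of Figs. 15.6 and 15.7 are easy to tabulate. A minimal sketch, with the hypothetical name `kerdock_weight_distribution`:

```python
def kerdock_weight_distribution(m):
    """Fig. 15.7 as {weight: number of codewords}, for even m >= 4."""
    if m < 4 or m % 2:
        raise ValueError("m must be even and at least 4")
    delta = 2 ** ((m - 2) // 2)
    outer = 2 ** m * (2 ** (m - 1) - 1)
    return {
        0: 1,
        2 ** (m - 1) - delta: outer,
        2 ** (m - 1): 2 ** (m + 1) - 2,
        2 ** (m - 1) + delta: outer,
        2 ** m: 1,
    }


if __name__ == "__main__":
    for m in (4, 6, 8):
        dist = kerdock_weight_distribution(m)
        size = sum(dist.values())
        d_min = min(w for w in dist if w)
        print((2 ** m, size, d_min), dist)
```

For $m = 4, 6, 8$ this gives the $(16, 256, 6)$, $(64, 2^{12}, 28)$ and $(256, 2^{16}, 120)$ parameters listed in Fig. 15.6.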


Definition of Kerdock code $\mathcal{K}(m)$. $\mathcal{K}(m)$ consists of $\mathcal{R}(1, m)$ together with
$2^{m-1} - 1$ cosets of $\mathcal{R}(1, m)$ in $\mathcal{R}(2, m)$. The cosets are chosen to be of
maximum rank, $m$, and the sum of any two cosets is also of rank $m$.
Alternatively, the Boolean functions associated with these cosets are quadratic
bent functions (§5 of Ch. 14), with the property that the sum of any two of
them is again a bent function. Since there are a total of $2^{m-1}$ cosets (or
associated symplectic forms), by Equation (27) $\mathcal{K}(m)$ is as large as it can be.

We shall write the codewords of $\mathcal{R}(1, m)$ in the form $|u|u + v|$, where
$u \in \mathcal{R}(1, m - 1)$ and $v \in \mathcal{R}(0, m - 1)$ (see Theorem 2 of Ch. 13). With this
notation $\mathcal{R}(1, m)$ consists of the codewords
$$|a|a\theta_0 + a_1 x^{i_1}\theta_1^*|b|b\theta_0 + a_1 x^{i_1}\theta_1^*|, \qquad (32)$$
where $a, a_1, b \in GF(2)$ and $0 \le i_1 \le 2^{m-1} - 2$.


The Kerdock code $\mathcal{K}(m)$, for $m = 2t + 2 \ge 4$, consists of $\mathcal{R}(1, m)$ together with
$\mu = 2^{m-1} - 1$ cosets of $\mathcal{R}(1, m)$ in $\mathcal{R}(2, m)$ having coset representatives

$$w_j = \Big|0\Big|x^j \sum_{i=1}^{t} \theta_{l_i}^*\Big|0\Big|x^j\Big(\theta_1^* + \sum_{i=1}^{t} \theta_{l_i}^*\Big)\Big| \qquad (33)$$
for $j = 1, \ldots, \mu$, where as usual $l_i = 1 + 2^i$. Let $w_0 = 0$.
Thus the general codeword of $\mathcal{K}(m)$ is
$$\Big|a\Big|a\theta_0 + a_1 x^{i_1}\theta_1^* + \epsilon x^j \sum_{i=1}^{t} \theta_{l_i}^*\Big|b\Big|b\theta_0 + a_1 x^{i_1}\theta_1^* + \epsilon x^j\Big(\theta_1^* + \sum_{i=1}^{t} \theta_{l_i}^*\Big)\Big|, \qquad (34)$$

where $\epsilon = 0$ or 1 and $1 \le j \le \mu$.


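Problems. (11) Let $\mathcal{C}$ be any code which consists of a linear code $\mathcal{H}$ together
with $l$ cosets of $\mathcal{H}$ with coset representatives $w_j$:

$$\mathcal{C} = \mathcal{H} \cup \bigcup_{j=1}^{l} (w_j + \mathcal{H}).$$
Thus $|\mathcal{C}| = (1 + l)|\mathcal{H}|$. $\mathcal{H}$ and $\mathcal{C}$ are represented by the elements

$$H = \sum_{v \in \mathcal{H}} z^v \quad \text{and} \quad C = H + \sum_{j=1}^{l} z^{w_j} H$$

of the group algebra (see §5 of Ch. 5). Show that the distance distribution of $\mathcal{C}$
is equal to the weight distribution of

$$H + \frac{2}{l+1} \sum_{j=1}^{l} z^{w_j} H + \frac{2}{l+1} \sum_{1 \le i < j \le l} z^{w_i + w_j} H.$$

(12) Show that the linear code generated by $\mathcal{K}(m)$ is $\mathcal{R}(2, m)$.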


The MS polynomials for the left and right halves of $w_j$ are

$$L(z) = \sum_{i=1}^{t} \sum_{s \in C_{l_i}} (\gamma z)^s,$$

and

$$R(z) = \sum_{i=1}^{t} \sum_{s \in C_{l_i}} (\gamma z)^s + \sum_{s \in C_1} (\gamma z)^s,$$

respectively, where $\gamma \in GF(2^{2t+1})^*$ depends on $j$.

Let the elements of $GF(2^{2t+1})$ be $(\xi_0 = 0, \xi_1, \ldots, \xi_\mu)$, in some fixed order.
Then

$$w_j = |L(\xi_0)|L(\xi_1), \ldots, L(\xi_\mu)|R(\xi_0)|R(\xi_1), \ldots, R(\xi_\mu)|$$
where

$$L(\xi_\nu) = \sum_{j=1}^{t} T_{2t+1}(\gamma\xi_\nu)^{1+2^j},$$

$$R(\xi_\nu) = \sum_{j=1}^{t} T_{2t+1}(\gamma\xi_\nu)^{1+2^j} + T_{2t+1}(\gamma\xi_\nu),$$

for $0 \le \nu \le \mu$. If we write the elements of $GF(2^{2t+2})$ as pairs $(\xi, \epsilon)$, where
$\xi \in GF(2^{2t+1})$ and $\epsilon = 0$ or 1, then we can use these pairs to index the
components of $w_j$ (or other vectors of $\mathcal{R}(2, m)$), where $\epsilon = 0$ on the LHS of
$w_j$ and $\epsilon = 1$ on the RHS of $w_j$.

For example, if $t = 1$ (so that $w_j$ has length 16), we write the elements of
$GF(2^4)$ as pairs $(\xi, \epsilon)$, where $\xi \in GF(2^3)$ and $\epsilon = 0$ or 1, as shown in Fig. 15.8.
Here $\gamma$ is a primitive element of $GF(2^4)$ with $\gamma^4 + \gamma + 1 = 0$, and $\alpha$ is a
primitive element of $GF(2^3)$ with $\alpha^3 + \alpha + 1 = 0$.


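$\gamma^i$         4-tuple    $\xi$         $\epsilon$

$0$                0000       $0$           0
$1$                1000       $1$           0
$\gamma$           0100       $\alpha$      0
$\gamma^2$         0010       $\alpha^2$    0
$\gamma^3$         0001       $0$           1
$\gamma^4$         1100       $\alpha^3$    0
$\gamma^5$         0110       $\alpha^4$    0
$\gamma^6$         0011       $\alpha^2$    1
$\gamma^7$         1101       $\alpha^3$    1
$\gamma^8$         1010       $\alpha^6$    0
$\gamma^9$         0101       $\alpha$      1
$\gamma^{10}$      1110       $\alpha^5$    0
$\gamma^{11}$      0111       $\alpha^4$    1
$\gamma^{12}$      1111       $\alpha^5$    1
$\gamma^{13}$      1011       $\alpha^6$    1
$\gamma^{14}$      1001       $1$           1


Fig. 15.8. The field $GF(2^4)$ in the $(\xi, \epsilon)$ notation, $\xi \in GF(2^3)$.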


For concreteness Fig. 15.9 shows $w_1$ and some codewords of $\mathcal{R}(1, 4)$, first
(Fig. 15.9a) as codewords of an extended cyclic code with the overall parity
check in the component labeled $\infty$, and second (Fig. 15.9b) indexed by the
pairs $(\xi, \epsilon)$, $\xi \in GF(2^3)$, $\epsilon = 0$ or 1. Figure 15.8 is used to convert from Fig.
15.9a to 15.9b. Note that in Fig. 15.9 $w_1 = x^4\theta_1^* + \theta_3^* + \theta_5^*$ and hence is in
$\mathcal{R}(2, 4)$.

(a) as codewords of an extended cyclic code (components labeled $1, \gamma, \gamma^2, \ldots, \gamma^{14}, \infty$).

$w_1 = x^4\theta_1^* + \theta_3^* + \theta_5^*$
$a = \theta_1^*$
$b = x^4\theta_1^*$
$c = \theta_0 + \theta_1^*$

(Here $\theta_1^*$ is an idempotent of block length 15 - see Problem 7 of Ch. 8.)


(b) indexed by $(\xi, \epsilon)$, $\xi \in GF(2^3)$, $\epsilon = 0$ or 1.

$w_1 = |0|\theta_3^*|0|\theta_1^* + \theta_3^*|$
$a = |0|0|1|\theta_0|$
$b = |0|x^2\theta_1^*|1|\theta_0 + x^2\theta_1^*|$
$c = |1|\theta_0|0|0|$

(Here $\theta_1^*$ is an idempotent of block length 7 - see Problem 5 of Ch. 8.)


Fig. 15.9. $w_1$ and some codewords of $\mathcal{R}(1, 4)$.
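The conversion between the two index sets can be traced by machine for $t = 1$. The sketch below is illustrative and rests on stated assumptions: field elements are stored as integers whose bit $k$ is the coefficient of the $k$th basis element, the components of a cyclic word are ordered $1, \gamma, \ldots, \gamma^{14}$, the coefficient of $x^i$ in $\theta_s^*$ is taken to be $\sum_{j \in C_s} \gamma^{ij}$, and $w_1$ is taken to be the vector with $\gamma = 1$ in $L$ and $R$. The function names are hypothetical.

```python
GF8 = (3, 0b1011)      # alpha^3 + alpha + 1 = 0
GF16 = (4, 0b10011)    # gamma^4 + gamma + 1 = 0


def gf_mul(a, b, m, poly):
    """Multiply two elements of GF(2^m) defined by the polynomial poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return result


def trace(x, m, poly):
    """Absolute trace of x, returned as the integer 0 or 1."""
    total, power = 0, x
    for _ in range(m):
        total ^= power
        power = gf_mul(power, power, m, poly)
    return total


def f(xi, eps):
    """f(xi, eps) = T_3(xi^3) + eps * T_3(xi), the case t = 1."""
    cube = gf_mul(xi, gf_mul(xi, xi, *GF8), *GF8)
    return trace(cube, *GF8) ^ (eps & trace(xi, *GF8))


def gamma_powers():
    """The integers representing 1, gamma, ..., gamma^14."""
    powers, x = [], 1
    for _ in range(15):
        powers.append(x)
        x = gf_mul(x, 2, *GF16)       # the integer 2 represents gamma
    return powers


def w1_cyclic():
    """Components of w_1 at 1, gamma, ..., gamma^14 (the one at infinity is f(0, 0))."""
    # The low three bits of the 4-tuple give xi; the top bit gives eps.
    return [f(p & 0b111, p >> 3) for p in gamma_powers()]


def idempotent_sum():
    """Coefficients of x^4 theta_1^* + theta_3^* + theta_5^* for length 15."""
    pw = gamma_powers()
    word = []
    for i in range(15):
        c1 = trace(pw[(i - 4) % 15], *GF16)     # x^4 theta_1^*
        c3 = trace(pw[(3 * i) % 15], *GF16)     # theta_3^*
        c5 = 0 if i % 3 == 0 else 1             # theta_5^*
        word.append(c1 ^ c3 ^ c5)
    return word


if __name__ == "__main__":
    print(w1_cyclic())
    print(idempotent_sum())
```

Under these conventions the two printed vectors coincide, which is consistent with the expression for $w_1$ quoted before Fig. 15.9.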


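Next we want a Boolean function $f_\gamma(v) = f_\gamma(v_1, \ldots, v_{2t+2})$ which represents
$w_j$; i.e. such that

$$f_\gamma = w_j \quad \text{(equality of vectors of length } 2^{2t+2}\text{)}.$$

We set $\xi$ equal to the element of $GF(2^{2t+1})$ represented by the binary vector
$(v_1, \ldots, v_{2t+1})$ and set $\epsilon = v_{2t+2}$. Then $f_\gamma$ is given by

$$f_\gamma(v_1, \ldots, v_{2t+2}) = f_\gamma(\xi, \epsilon) = \sum_{j=1}^{t} T_{2t+1}(\gamma\xi)^{1+2^j} + \epsilon T_{2t+1}(\gamma\xi).$$
Then indeed $f_\gamma = w_j$, since $f_\gamma(\xi, 0) = L(\xi)$ and $f_\gamma(\xi, 1) = R(\xi)$.

The crucial result is the following.


Theorem 18. $f_\gamma(v)$ is a bent function, and for $\gamma \ne \delta$, $f_\gamma(v) + f_\delta(v)$ is also a bent
function.


Proof. The proof is by showing that the symplectic forms corresponding to
$f_\gamma(v)$ and to $f_\gamma(v) + f_\delta(v)$ have rank $2t + 2$. The symplectic form corresponding
to $f_\gamma(\xi, \epsilon)$ is, from Equation (5),

$$\mathcal{B}_\gamma((\xi, \epsilon_1), (\eta, \epsilon_2)) = \sum_{j=1}^{t} T_{2t+1}(\gamma^{1+2^j}(\xi^{2^j}\eta + \xi\eta^{2^j})) + T_{2t+1}(\gamma(\epsilon_1\eta + \epsilon_2\xi))$$
$$= T_{2t+1}(\gamma^2\xi\eta) + T_{2t+1}(\gamma\xi)T_{2t+1}(\gamma\eta) + T_{2t+1}(\gamma(\epsilon_1\eta + \epsilon_2\xi)). \qquad (35)$$

To find the rank of $\mathcal{B}_\gamma$ we must find the dimension of the space of $(\eta, \epsilon_2)$ such
that $\mathcal{B}_\gamma((\xi, \epsilon_1), (\eta, \epsilon_2)) = 0$ for all $(\xi, \epsilon_1)$. By choosing $(\xi, \epsilon_1) = (0, 1)$ we deduce
that $T_{2t+1}(\gamma\eta) = 0$, hence $\mathcal{B}_\gamma = T_{2t+1}(\gamma\xi(\gamma\eta + \epsilon_2))$. So we must have $\gamma\eta + \epsilon_2 = 0$,

$\therefore\ T_{2t+1}(\gamma\eta + \epsilon_2) = 0$,
$\therefore\ T_{2t+1}(\epsilon_2) = 0$,
$\therefore\ \epsilon_2 = 0$ (because $2t + 1$ is odd),
$\therefore\ \mathcal{B}_\gamma = T_{2t+1}(\gamma^2\xi\eta)$,

and so $\eta$ must be 0.

Therefore $(\eta, \epsilon_2) = (0, 0)$ is the only pair for which $\mathcal{B}_\gamma((\xi, \epsilon_1), (\eta, \epsilon_2)) = 0$ for
all $(\xi, \epsilon_1)$. Hence the rank of $\mathcal{B}_\gamma$ is $2t + 2$.

It remains to show that $\mathcal{B}_\gamma + \mathcal{B}_\delta$ has rank $2t + 2$ if $\gamma \ne \delta$. The proof is
similar to the preceding and is left to the reader. Q.E.D.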
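Theorem 18 can be checked exhaustively for small $t$ through the Walsh transform: a Boolean function of $m$ variables is bent exactly when every Walsh coefficient has absolute value $2^{m/2}$. The following sketch assumes the irreducible polynomials $x^3+x+1$ and $x^5+x^2+1$ for $GF(2^3)$ and $GF(2^5)$ and identifies the pair $(\xi, \epsilon)$ with the integer $\xi + \epsilon \cdot 2^{2t+1}$; the helper names are hypothetical.

```python
from itertools import combinations

# Assumed irreducible polynomials for GF(2^(2t+1)), t = 1 and t = 2.
IRREDUCIBLE = {3: 0b1011, 5: 0b100101}


def gf_mul(a, b, m):
    """Multiply two elements of GF(2^m), each stored as an integer below 2^m."""
    poly = IRREDUCIBLE[m]
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return result


def trace(x, m):
    """Absolute trace of x, returned as the integer 0 or 1."""
    total, power = 0, x
    for _ in range(m):
        total ^= power
        power = gf_mul(power, power, m)
    return total


def kerdock_function(gamma, t):
    """Truth table of f_gamma, indexed by xi + eps * 2^(2t+1)."""
    m1 = 2 * t + 1
    table = []
    for eps in (0, 1):
        for xi in range(1 << m1):
            y = gf_mul(gamma, xi, m1)
            value = eps & trace(y, m1)
            power = y
            for _ in range(t):
                power = gf_mul(power, power, m1)            # y^(2^j)
                value ^= trace(gf_mul(y, power, m1), m1)    # T(y^(1+2^j))
            table.append(value)
    return table


def is_bent(table):
    """True iff every Walsh coefficient of the truth table equals +-2^(m/2)."""
    size = len(table)
    m = size.bit_length() - 1
    target = 1 << (m // 2)
    for u in range(size):
        total = sum(
            1 - 2 * (table[x] ^ (bin(u & x).count("1") & 1)) for x in range(size)
        )
        if abs(total) != target:
            return False
    return True


if __name__ == "__main__":
    t = 1
    tables = [kerdock_function(g, t) for g in range(1, 1 << (2 * t + 1))]
    print(all(is_bent(tb) for tb in tables))
    print(all(is_bent([a ^ b for a, b in zip(p, q)])
              for p, q in combinations(tables, 2)))
```

Both lines should print `True`; the same check for $t = 2$ takes a second or two.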


Weight distribution of $\mathcal{K}(m)$. From Theorem 5 a coset of $\mathcal{R}(1, m)$ of rank $m$
contains $2^m$ vectors of weight $2^{m-1} - 2^{(m-2)/2}$ and $2^m$ vectors of weight $2^{m-1} +
2^{(m-2)/2}$. The first-order RM code itself contains $2^{m+1} - 2$ codewords of weight
$2^{m-1}$ and the words 0 and 1. Hence the weight distribution of $\mathcal{K}(m)$ is as given
in Fig. 15.7. From Problem 11 and Theorem 18 the distance distribution is also
given by Fig. 15.7. In particular, $\mathcal{K}(4)$ is equivalent to $\mathcal{N}_{16}$, since $\mathcal{N}_{16}$ is unique
(§8 of Ch. 2).

In §6 we shall describe another nonlinear code, the Preparata code $\mathcal{P}(m)$,
which has the surprising property that its weight (or distance) distribution $\{B_i\}$
is the MacWilliams transform of the weight distribution of $\mathcal{K}(m)$. Thus from
Equation (13) of Ch. 5,

$$2^{2m}B_i = P_i(0) + 2^m(2^{m-1} - 1)\left(P_i(2^{m-1} - 2^{(m-2)/2}) + P_i(2^{m-1} + 2^{(m-2)/2})\right) + (2^{m+1} - 2)P_i(2^{m-1}) + P_i(2^m), \qquad (36)$$
where $P_i(x)$ is a Krawtchouk polynomial (§2 of Ch. 5). It is readily checked
that the minimum distance in $\mathcal{P}(m)$ is $d' = 6$. It then follows from Theorem 9
of Ch. 6 that the codewords of each weight in $\mathcal{K}(m)$ form 3-designs (since
$\bar{s} = 3$).
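Equation (36) is straightforward to evaluate with exact integer arithmetic. The sketch below uses the standard expansion $P_k(x) = \sum_j (-1)^j \binom{x}{j}\binom{n-x}{k-j}$ of the Krawtchouk polynomial; the function names are hypothetical.

```python
from math import comb


def krawtchouk(k, x, n):
    """Binary Krawtchouk polynomial P_k(x) for length n."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))


def kerdock_weight_distribution(m):
    """Fig. 15.7 as {weight: number of codewords}, for even m >= 4."""
    delta = 2 ** ((m - 2) // 2)
    outer = 2 ** m * (2 ** (m - 1) - 1)
    return {
        0: 1,
        2 ** (m - 1) - delta: outer,
        2 ** (m - 1): 2 ** (m + 1) - 2,
        2 ** (m - 1) + delta: outer,
        2 ** m: 1,
    }


def transformed_distribution(m):
    """The numbers B_0, ..., B_n defined by Equation (36)."""
    n = 2 ** m
    kerdock = kerdock_weight_distribution(m)
    result = []
    for k in range(n + 1):
        total = sum(count * krawtchouk(k, w, n) for w, count in kerdock.items())
        if total % 2 ** (2 * m):
            raise ArithmeticError("transform is not integral")
        result.append(total // 2 ** (2 * m))
    return result


if __name__ == "__main__":
    b = transformed_distribution(4)
    print({i: v for i, v in enumerate(b) if v})
```

For $m = 4$ the transform reproduces the Kerdock distribution itself, and for $m = 6$ the first nonzero term after $B_0$ is $B_6$.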


Case (II). m even, $m = 2t + 2$ (continued). We now study the general case,
where $d$ is any number in the range $1 \le d \le \tfrac12 m$. The nonlinear codes we are
about to describe were discovered by Delsarte and Goethals, and are denoted
by $\mathcal{DG}(m, d)$. Here is the definition.

If $d = \tfrac12 m$ let $\mathcal{C} = \mathcal{R}(1, 2t + 1)$, while if $1 \le d < \tfrac12 m$ let $\mathcal{C}$ be the

$$[2^{2t+1}, (2t + 1)(t - d + 2) + 1, 2^{2t} - 2^{2t-d}]$$

code defined by Equation (31) of Corollary 17. Let $w_0 = 0, w_1, \ldots, w_\mu$ be the
vectors defined by Equation (33), and let $v_0 = |0 \cdots 0|1 \cdots 1|$.


Definition. For $1 \le d \le \tfrac12 m = t + 1$ the code $\mathcal{DG}(m, d)$ consists of all vectors
$$|c(x)|c(x)| + w_j + \epsilon v_0 \qquad (37)$$

for $c(x) \in \mathcal{C}$, $0 \le j \le \mu$, $\epsilon = 0$ or 1.


Theorem 19. $\mathcal{DG}(m, d)$, where $m = 2t + 2 \ge 4$, is a code of length $2^{2t+2}$ containing
$2^{(2t+1)(t-d+2)+2t+3}$ codewords and having minimum distance $2^{2t+1} - 2^{2t+1-d}$.
If $d = 1$, $\mathcal{DG}(m, 1) = \mathcal{R}(2, m)$, while for $2 \le d \le \tfrac12 m$, $\mathcal{DG}(m, d)$ is a nonlinear
subcode of $\mathcal{R}(2, m)$.


Proof. There are $2^{(2t+1)(t-d+2)+2t+3}$ codewords of the form (37). That these are all
distinct and in fact have distance at least $2^{2t+1} - 2^{2t+1-d}$ apart will follow from
Theorem 20 below. Q.E.D.


Note that $\mathcal{DG}(m, d)$ contains the Kerdock code $\mathcal{K}(m)$ as a subcode. This is
so because $|c(x)|c(x)| + \epsilon v_0$ includes all codewords of $\mathcal{R}(1, 2t + 2)$. Also
$\mathcal{DG}(m, \tfrac12 m)$ is equal to $\mathcal{K}(m)$. Thus these codes are a generalization of the
Kerdock codes. From Problem 12 the linear code generated by $\mathcal{DG}(m, d)$ is
$\mathcal{R}(2, m)$.

The first few codes $\mathcal{DG}(m, d)$ are shown in Fig. 15.10, where $k = \log_2$
(number of codewords) and $\delta$ = minimum distance.
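The parameters in Theorem 19 can be tabulated directly from the formulas. A minimal sketch, with the hypothetical name `dg_parameters`:

```python
def dg_parameters(m, d):
    """Return (n, k, delta) for DG(m, d) as given by Theorem 19."""
    if m < 4 or m % 2:
        raise ValueError("m must be even and at least 4")
    t = (m - 2) // 2
    if not 1 <= d <= t + 1:
        raise ValueError("d must satisfy 1 <= d <= m/2")
    k = (2 * t + 1) * (t - d + 2) + 2 * t + 3     # log2 of the number of codewords
    delta = 2 ** (2 * t + 1) - 2 ** (2 * t + 1 - d)
    return 2 ** m, k, delta


if __name__ == "__main__":
    for m in (4, 6, 8):
        for d in range(1, m // 2 + 1):
            n, k, delta = dg_parameters(m, d)
            print(f"DG({m},{d}): ({n}, 2^{k}, {delta})")
```

The two extreme cases agree with the text: $d = 1$ gives the dimension $1 + m + \binom{m}{2}$ and distance $2^{m-2}$ of $\mathcal{R}(2, m)$, and $d = \tfrac12 m$ gives the Kerdock parameters.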

$\mathcal{DG}(m, d)$ is contained in $\mathcal{R}(2, m)$ and is a union of cosets of $\mathcal{R}(1, m)$.
Consider the coset of $\mathcal{R}(1, m)$ which contains the codeword (37). The
symplectic form corresponding to this coset is

$$\mathcal{B}_\gamma((\xi, \epsilon_1), (\eta, \epsilon_2)) = T_{2t+1}\Big(\xi \sum_{j=1}^{t-d+1} \big(\gamma_j^{1+2^j}\eta^{2^j} + (\gamma_j^{1+2^j}\eta)^{2^{2t+1-j}}\big)\Big)$$
$$+\ T_{2t+1}(\gamma^2\xi\eta) + T_{2t+1}(\gamma\xi)T_{2t+1}(\gamma\eta) + T_{2t+1}(\gamma(\epsilon_1\eta + \epsilon_2\xi)). \qquad (38)$$

Fig. 15.10. The parameters $(n, 2^k, \delta$ = minimum distance$)$ of the first $\mathcal{DG}(m, d)$ codes.

The first term comes from $|c(x)|c(x)|$ via Equations (28), (29), and the rest
from $w_j$ via Equation (35). $\gamma$ corresponds to $w_j$. The number of distinct
symplectic forms of this type is $(2^{2t+1})^{t-d+2}$. From the following theorem and
Equation (27) this number is as large as it can be.

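The crucial property of these codes is given by:


Theorem 20. The rank of any symplectic form (38) is at least $2d$, and the rank
of the sum of any two such forms is also at least $2d$.


Proof. We write (38) as

$$\mathcal{B}_\gamma((\xi, \epsilon_1), (\eta, \epsilon_2)) = T_{2t+1}(\xi L(\eta)) + T_{2t+1}(\xi L_\gamma(\eta)) + T_{2t+1}(\gamma(\epsilon_1\eta + \epsilon_2\xi)),$$
where

$$L(\eta) = \sum_{j=1}^{t-d+1} \left(\gamma_j^{1+2^j}\eta^{2^j} + (\gamma_j^{1+2^j}\eta)^{2^{2t+1-j}}\right),$$

$$L_\gamma(\eta) = \gamma^2\eta + \gamma T_{2t+1}(\gamma\eta). \qquad (39)$$

It suffices to show that the sum $\mathcal{B}_\gamma + \mathcal{B}_\delta$, $\gamma \ne \delta$, of two such forms has rank
$\ge 2d$. Since $\mathcal{C}$ is a linear code $T_{2t+1}(\xi L(\eta)) + T_{2t+1}(\xi L'(\eta)) = T_{2t+1}(\xi L''(\eta))$,
where $L''$ is given by (39) for suitable $\gamma_j$'s. Therefore
$$\mathcal{B}_\gamma((\xi, \epsilon_1), (\eta, \epsilon_2)) + \mathcal{B}_\delta((\xi, \epsilon_1), (\eta, \epsilon_2))$$
$$= T_{2t+1}(\xi L''(\eta)) + T_{2t+1}(\xi L_\gamma(\eta)) + T_{2t+1}(\xi L_\delta(\eta)) + T_{2t+1}((\gamma + \delta)(\epsilon_1\eta + \epsilon_2\xi)). \qquad (40)$$
It is convenient to define
$$\mathcal{B}'(\xi, \eta) = \mathcal{B}_\gamma((\xi, 0), (\eta, 0)) + \mathcal{B}_\delta((\xi, 0), (\eta, 0)),$$
which is a symplectic form with $\xi, \eta \in GF(2^{2t+1})$. Then
$$\mathcal{B}'(\xi, \eta) = T_{2t+1}(\xi L_{\mathcal{B}'}(\eta)), \qquad (41)$$
where
$$L_{\mathcal{B}'}(\eta) = \sum_{j=1}^{t-d+1} \left(\gamma_j^{1+2^j}\eta^{2^j} + (\gamma_j^{1+2^j}\eta)^{2^{2t+1-j}}\right) + (\gamma^2 + \delta^2)\eta + \gamma T_{2t+1}(\gamma\eta) + \delta T_{2t+1}(\delta\eta).$$
Let the rank of $\mathcal{B}'(\xi, \eta)$ be $2h$. We shall show that $2h \ge 2d - 2$.
Expand
$$L_{\mathcal{B}'}(\eta) = \sum_{i=0}^{2t} A_i \eta^{2^i}.$$
Then
$$A_i = \gamma^{1+2^i} + \delta^{1+2^i} \quad \text{for } t - d + 2 \le i \le t + d - 1. \qquad (42)$$

Let us assume that the underlying Boolean function is in the canonical form
given by Theorem 4. Its MS polynomial is given by Equation (11). It is readily
seen that the values of the corresponding symplectic form are $T_{2t+1}(\xi L_{\mathcal{B}'}(\eta))$,
where

$$L_{\mathcal{B}'}(\eta) = \sum_{j=1}^{h} \left(\alpha_j T_{2t+1}(\beta_j\eta) + \beta_j T_{2t+1}(\alpha_j\eta)\right) \qquad (43)$$

and $\alpha_1, \ldots, \alpha_h, \beta_1, \ldots, \beta_h$ are linearly independent elements of $GF(2^{2t+1})$.
Comparing (42) and (43) we see that, for $t - d + 2 \le i \le t + d - 1$,

$$\gamma^{1+2^i} + \delta^{1+2^i} = \sum_{j=1}^{h} (\alpha_j\beta_j^{2^i} + \alpha_j^{2^i}\beta_j) = \sum_{j=1}^{h} \left(\alpha_j^{1+2^i} + \beta_j^{1+2^i} + (\alpha_j + \beta_j)^{1+2^i}\right).$$
If we write
$$s_i = \gamma^{1+2^i} + \delta^{1+2^i} + \sum_{j=1}^{h} \left(\alpha_j^{1+2^i} + \beta_j^{1+2^i} + (\alpha_j + \beta_j)^{1+2^i}\right),$$

then clearly

$$s_i = 0 \quad \text{for } t - d + 2 \le i \le t + d - 1. \qquad (44)$$
Let $U$ be the set $\{\gamma, \delta, \alpha_1, \beta_1, \alpha_1 + \beta_1, \ldots, \alpha_h, \beta_h, \alpha_h + \beta_h\}$, so that we can write
$$s_i = \sum_{a \in U} a^{1+2^i}. \qquad (45)$$

Let $V(U)$ be the smallest subspace of $GF(2^{2t+1})$ which contains all of $U$.
Clearly dim $V(U) \le 2h + 2$. Let $\sigma(z)$ be the linearized polynomial corresponding
to $U$ (see §9 of Ch. 4), given by

$$\sigma(z) = \prod_{\lambda \in V(U)} (z + \lambda) = \sum_{i=0}^{2h+2} \sigma_i z^{2^i}.$$

Note that $\sigma_0 \ne 0$ (for otherwise $\sigma(z)$ is a perfect square). Then

$$0 = \sum_{a \in U} a\,\sigma(a)^{2^i}$$
$$= \sigma_0^{2^i} \sum_{a \in U} a^{1+2^i} + \sigma_1^{2^i} \sum_{a \in U} a^{1+2^{i+1}} + \cdots + \sigma_{2h+2}^{2^i} \sum_{a \in U} a^{1+2^{i+2h+2}}$$
$$= \sigma_0^{2^i} s_i + \sum_{j=1}^{2h+2} \sigma_j^{2^i} s_{i+j} \quad \text{by (45)}. \qquad (46)$$

Now we assume that $2h + 2 \le 2d - 2$ and arrive at a contradiction. This will
prove that $2h + 2 > 2d - 2$ or $2h \ge 2d - 2$, as claimed.

Setting $i = t - d + 1$ in (46) and using (44) we deduce that $s_{t-d+1} = 0$. Then
setting $i = t - d, \ldots$ implies $s_i = 0$ for all $0 \le i \le t + d - 1$. But $s_0 = \gamma^2 + \delta^2 \ne 0$,
a contradiction.

Therefore the restricted form $\mathcal{B}'$ has rank at least $2d - 2$. If the rank of $\mathcal{B}'$
is $2d$ or more, so is the rank of $\mathcal{B}_\gamma + \mathcal{B}_\delta$ and we are done. Suppose then that
the rank of $\mathcal{B}'$ is exactly $2h = 2d - 2$. We need a lemma.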


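Lemma 21. Let $\mathcal{B}((\xi, \epsilon_1), (\eta, \epsilon_2))$ be the symplectic form
$$\mathcal{B}((\xi, \epsilon_1), (\eta, \epsilon_2)) = T_{2t+1}(\xi L(\eta)) + T_{2t+1}(\beta(\epsilon_1\eta + \epsilon_2\xi)),$$
where $L(\eta)$ is a linearized polynomial over $GF(2^{2t+1})$ and $\beta \in GF(2^{2t+1})$. Let
$$\mathcal{B}'(\xi, \eta) = \mathcal{B}((\xi, 0), (\eta, 0)) = T_{2t+1}(\xi L(\eta)).$$
Then if the equation $L(\eta) = \beta$ has no solutions in $GF(2^{2t+1})$,
$$\text{rank } \mathcal{B} = \text{rank } \mathcal{B}' + 2.$$


Proof of Lemma. To find the rank of $\mathcal{B}$ we need to know the number of $(\eta, \epsilon_2)$
such that $\mathcal{B}((\xi, \epsilon_1), (\eta, \epsilon_2)) = 0$ for all $(\xi, \epsilon_1)$. Setting $(\xi, \epsilon_1) = (0, 1)$ we see that
$T_{2t+1}(\beta\eta) = 0$. Thus $\mathcal{B} = T_{2t+1}(\xi(L(\eta) + \beta\epsilon_2))$. So we must have $L(\eta) + \beta\epsilon_2 = 0$.
If $L(\eta) = \beta$ has no solution, the only zeros have $\epsilon_2 = 0$. In fact the zeros are
$$\{(\eta, 0) : L(\eta) = T_{2t+1}(\beta\eta) = 0\}.$$
Now from (20),
$$\text{rank } \mathcal{B}' = 2t + 1 - \dim(\text{kernel } L(\eta)),$$

and hence
$$\text{rank } \mathcal{B} \ge 2t + 2 - \dim(\text{kernel } L(\eta)).$$

Thus rank $\mathcal{B}$ > rank $\mathcal{B}'$, i.e. rank $\mathcal{B}$ = rank $\mathcal{B}' + 2$ (since it cannot be more
than this). This completes the proof of the lemma. Q.E.D.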


For the proof of the theorem, $\mathcal{B}$ and $\mathcal{B}'$ are given by (40) and (41), and
$\beta = \gamma + \delta$. To complete the proof we must show that if rank $\mathcal{B}' = 2d - 2$ then
the equation $L_{\mathcal{B}'}(\eta) = \gamma + \delta$ has no solution in $GF(2^{2t+1})$.

Suppose then that $\gamma + \delta = L_{\mathcal{B}'}(\eta)$ for some $\eta \in GF(2^{2t+1})$. From Equation
(43) $\gamma + \delta$ is linearly dependent on $\alpha_1, \ldots, \alpha_h, \beta_1, \ldots, \beta_h$, i.e.,

$$\gamma + \delta = \sum_{j=1}^{h} (a_j\alpha_j + b_j\beta_j), \quad \text{for } a_j, b_j \in GF(2),$$

$$= \sum_{j=1}^{h} \left(a_j(\alpha_j + b_j\gamma) + b_j(\beta_j + a_j\gamma)\right).$$

Let $U'$ be the set
$$U' = \{\gamma + \delta, \alpha_1 + b_1\gamma, \beta_1 + a_1\gamma, \alpha_1 + \beta_1 + (a_1 + b_1)\gamma, \ldots,$$
$$\alpha_h + b_h\gamma, \beta_h + a_h\gamma, \alpha_h + \beta_h + (a_h + b_h)\gamma\},$$

and $V(U')$ the smallest subspace of $GF(2^{2t+1})$ containing all of $U'$. Clearly
dim $V(U') \le 2h = 2d - 2$. Now defining

$$s'_i = \sum_{a \in U'} a^{1+2^i},$$

it is straightforward to verify that $s'_i = s_i$. Then by the same argument as
before, $s_i = 0$ for $i \le t + d - 1$. But again $s_0 \ne 0$, a contradiction. Hence our
assumption that $\gamma + \delta = L_{\mathcal{B}'}(\eta)$ is false, thus completing the proof of the
theorem. Q.E.D.


From Theorem 5 it follows that $\mathcal{DG}(m, d)$ has distances $0, 2^{m-1}, 2^m$ and
$2^{m-1} \pm 2^{m-h-1}$ for $d \le h \le t + 1$. In §8 of Ch. 21 we will show how to find the
distance distribution of this code. The distance distributions in the special
cases $d = t + 1$ and $d = t$ are given in Figs. 15.7 and 15.13.


Problem. (13) Show that $\mathcal{DG}(m, d - 1)$ is a union of disjoint translates of
$\mathcal{DG}(m, d)$.




*§6. The Preparata code


We now describe another nonlinear code, the Preparata code $\mathcal{P}(m)$, which
has the property that its weight (or distance) distribution $\{B_i\}$ is the MacWilliams
transform of the weight distribution of the Kerdock code $\mathcal{K}(m)$, and
is given by Equation (36). Of course the Preparata code is not the dual of the
Kerdock code in the usual sense, since both codes are nonlinear.

$\mathcal{P}(m)$ can be constructed in a similar way to the Kerdock code, as the
union of a linear code $\Pi$ and $2^{m-1} - 1$ cosets of $\Pi$ in $\mathcal{R}(m - 2, m)$. Recall that
$\mathcal{R}(m - 2, m)$ is the extended Hamming code and is the dual code to $\mathcal{R}(1, m)$.

From Theorem 2 of Ch. 13, $\mathcal{R}(m - 2, m)$ consists of the codewords

$$|u|u + v| : u \text{ even weight},\ v \in \mathcal{R}(m - 3, m - 1);$$
i.e. consists of the codewords
$$|h(1)|h(x)|h(1) + k(1)|h(x) + k(x)(1 + \theta_1)|,$$

where $h(x)$ and $k(x)$ are arbitrary.


Definition. For $m$ even, $m = 2t + 2 \ge 4$, the Preparata code $\mathcal{P}(m)$ consists of a
linear code $\Pi$ (which is contained in $\mathcal{R}(m - 2, m)$), together with $\mu = 2^{m-1} - 1$
cosets of $\Pi$ in $\mathcal{R}(m - 2, m)$ having coset representatives $w_j$, $j = 1, \ldots, \mu$. $\Pi$
consists of the codewords

$$|g(1)|g(x)(1 + \theta_1)|f(1) + g(1)|g(x)(1 + \theta_1) + f(x)(1 + \theta_1 + \theta_3)|, \qquad (47)$$
where $f(x)$ and $g(x)$ are arbitrary, and
$$w_j = |1|x^j|0|x^j\theta_1|, \quad j = 1, \ldots, \mu. \qquad (48)$$
Thus $\Pi$ has dimension $2^{m-1} - 1 - (m - 1) + 2^{m-1} - 1 - 2(m - 1) = 2^m - 3m + 1$. $\Pi$
is a $[16, 5, 8]$ code if $m = 4$, and a $[2^m, 2^m - 3m + 1, 6]$ code if $m \ge 6$.
$\mathcal{P}(m)$ consists of the following $2^k$ codewords, where $k = 2^m - 2m$:
$$|g(1) + a|g(x)(1 + \theta_1) + ax^j|f(1) + g(1)|g(x)(1 + \theta_1) + f(x)(1 + \theta_1 + \theta_3) + ax^j\theta_1|, \qquad (49)$$
where $a = 0$ or 1 and $1 \le j \le \mu$. (See Fig. 15.12.)


The first Preparata code coincides with the first Kerdock code (and hence
is equivalent to the Nordstrom-Robinson code $\mathcal{N}_{16}$):


Lemma 22. $\mathcal{P}(4) = \mathcal{K}(4)$, i.e. the Preparata and Kerdock codes of length 16 are
the same.

Proof. The primitive idempotents for length 7 are $\theta_0, \theta_1, \theta_3$ with $\theta_1^* = \theta_3$, $\theta_3^* =
\theta_1$, $\theta_0 + \theta_1 + \theta_3 = 1$. Thus $\Pi$ consists of the codewords







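$$|g(1)|g(x)(\theta_0 + \theta_1^*)|f(1) + g(1)|g(x)(\theta_0 + \theta_1^*) + f(x)\theta_0|$$
$$= |g(1)|g(1)\theta_0 + a_1 x^{i_1}\theta_1^*|f(1) + g(1)|f(1)\theta_0 + g(1)\theta_0 + a_1 x^{i_1}\theta_1^*|.$$

Therefore $\Pi = \mathcal{R}(1, 4)$.
From Equations (33) and (48) the coset representatives for $\mathcal{K}(4)$ and $\mathcal{P}(4)$
are respectively

$$w_j(\mathcal{K}) = |0|x^j\theta_3^*|0|x^j(\theta_1^* + \theta_3^*)|,$$
$$w_j(\mathcal{P}) = |1|x^j|0|x^j\theta_1|.$$

Their sum is
$$|1|x^j(\theta_0 + \theta_1^*)|0|x^j\theta_1^*|,$$

which is in $\mathcal{R}(1, 4)$. Hence $w_j(\mathcal{K})$ and $w_j(\mathcal{P})$ define the same coset of $\mathcal{R}(1, 4)$.
Therefore $\mathcal{P}(4) = \mathcal{K}(4)$. Q.E.D.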


Problem. (14) Show that the linear code generated by $\mathcal{P}(m)$ is a $[2^m, 2^m - 2m + 1, 4]$
subcode of $\mathcal{R}(m - 2, m)$ consisting of all codewords of the form


| F(1)| F(x) |e  F(1) + G(1)|€0 + F(x) + G(x)(1 + 6, + 65)|. 


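Lemma 23. $\Pi^{\perp}$ consists of the codewords
$$|c|c\theta_0 + c_1 x^{i_1}\theta_1^* + c_3 x^{i_3}\theta_3^*|d|d\theta_0 + d_1 x^{j_1}\theta_1^* + c_3 x^{i_3}\theta_3^*|,$$

where $c, c_1, c_3, d, d_1$ are 0 or 1 and $i_1, i_3, j_1 \in \{0, \ldots, 2^{m-1} - 2\}$.


Proof. We may write $\Pi = \mathcal{A}_1 + \mathcal{A}_2$, where $\mathcal{A}_1$ and $\mathcal{A}_2$ consist of the vectors

$$|0|0|f(1)|f(x)(1 + \theta_1 + \theta_3)|,$$
$$|g(1)|g(x)(1 + \theta_1)|g(1)|g(x)(1 + \theta_1)|,$$

respectively. Then $\mathcal{A}_1^{\perp}$ and $\mathcal{A}_2^{\perp}$ consist of the vectors

$$|c|h_1(x)|d|d\theta_0 + d_1 x^{j_1}\theta_1^* + c_3 x^{i_3}\theta_3^*|,$$
$$|e_1|(e_1 + e_2)\theta_0 + h_2(x) + c_1 x^{i_1}\theta_1^*|e_2|h_2(x)|,$$

respectively. Therefore in $\Pi^{\perp} = \mathcal{A}_1^{\perp} \cap \mathcal{A}_2^{\perp}$ (see Problem 33 of Ch. 1) we must
have $c = e_1$, $h_1(x) = (e_1 + e_2)\theta_0 + h_2(x) + c_1 x^{i_1}\theta_1^*$, etc. These equations imply that
$\Pi^{\perp}$ is as stated. Q.E.D.


Note that $\mathcal{R}(1, m) \subset \Pi^{\perp} \subset \mathcal{R}(2, m)$ (see Fig. 15.11).

Fig. 15.11.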

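We come now to the main theorem of this section.


Theorem 24. The weight (or distance) distribution $\{B_i\}$ of $\mathcal{P}(m)$ is the
MacWilliams transform of the weight distribution of $\mathcal{K}(m)$, and has the
generating function

$$2^{2m} \sum_{i=0}^{2^m} B_i z^i = (1 + z)^{2^m} + 2^m(2^{m-1} - 1)\Big\{(1 + z)^{2^{m-1}+2^{(m-2)/2}}(1 - z)^{2^{m-1}-2^{(m-2)/2}}$$
$$+\ (1 + z)^{2^{m-1}-2^{(m-2)/2}}(1 - z)^{2^{m-1}+2^{(m-2)/2}}\Big\} + (2^{m+1} - 2)(1 - z^2)^{2^{m-1}} + (1 - z)^{2^m}. \qquad (50)$$

Alternatively $B_i$ is given by Equation (36).


Proof. The distance distribution of $\mathcal{P}(m)$ is (from Problem 11) the weight
distribution of the element

$$D = \Pi + \frac{1}{2^{m-2}} \sum_{i=1}^{2^{m-1}-1} z^{w_i}\Pi + \frac{1}{2^{m-2}} \sum_{1 \le i < j \le 2^{m-1}-1} z^{w_i+w_j}\Pi$$

of the group algebra, where

$$\Pi = \sum_{v \in \Pi} z^v.$$

The MacWilliams transform of this weight distribution is the set of numbers

$$A'_l = \frac{1}{|\mathcal{P}(m)|} \sum_{\text{wt}(u) = l} \chi_u(D), \quad 0 \le l \le 2^m,$$

where the sum is over all vectors $u$ of length $2^m$ and weight $l$ (from Theorem
5 of Ch. 5).
We show that $A'_l = 0$ unless $l$ is one of the numbers $0$, $2^{m-1} \pm 2^{(m-2)/2}$, $2^{m-1}$ or
$2^m$; i.e. one of the weights of $\mathcal{K}(m)$; and that in fact the set of numbers $A'_l$ is
the weight distribution of $\mathcal{K}(m)$.
We consider separately the contribution to $A'_l$ from $u \in \mathcal{R}(1, m)$, $u \in
\Pi^{\perp} - \mathcal{R}(1, m)$ and $u \notin \Pi^{\perp}$.


Case (I). $u \in \mathcal{R}(1, m)$.
Lemma 25. If $u \in \mathcal{R}(1, m)$, $\chi_u(D) = |\mathcal{P}(m)|$.
Proof. $\mathcal{P}(m) \subset \mathcal{R}(1, m)^{\perp}$, thus $(-1)^{u \cdot v} = 1$ for all $v \in \mathcal{P}(m)$. Q.E.D.


Corollary 26. The codewords of $\mathcal{R}(1, m)$ contribute 1 to $A'_0$, 1 to $A'_{2^m}$ and
$2^{m+1} - 2$ to $A'_{2^{m-1}}$.


Case (II). $u \notin \Pi^{\perp}$.
Lemma 27. If $u \notin \Pi^{\perp}$, $\chi_u(D) = 0$.


Proof. We have $\chi_u(\Pi) = 0$, hence $\chi_u(z^{w_i}\Pi) = (-1)^{u \cdot w_i}\chi_u(\Pi) = 0$, and
$\chi_u(z^{w_i+w_j}\Pi) = (-1)^{u \cdot (w_i+w_j)}\chi_u(\Pi) = 0$. Q.E.D.


Case (III). $u \in \Pi^{\perp} - \mathcal{R}(1, m)$. This case is more complicated. We must calculate

$$\chi_u(D) = \chi_u(\Pi)\Big\{1 + \frac{1}{2^{m-2}} \sum_{i=1}^{2^{m-1}-1} (-1)^{u \cdot w_i} + \frac{1}{2^{m-2}} \sum_{1 \le i < j \le 2^{m-1}-1} (-1)^{u \cdot (w_i+w_j)}\Big\}.$$

Suppose $u \cdot w_i$ is even for $\alpha$ values of $i$ and odd for $\beta$ values of $i$. Then
clearly

$$2^{m-1} - 1 = \alpha + \beta,$$

$$\sum_{i=1}^{2^{m-1}-1} (-1)^{u \cdot w_i} = \alpha - \beta.$$

Also $u \cdot (w_i + w_j)$ is even for $\binom{\alpha}{2} + \binom{\beta}{2}$ pairs $w_i, w_j$ and odd for $\alpha\beta$ pairs. Thus

$$\sum_{1 \le i < j \le 2^{m-1}-1} (-1)^{u \cdot (w_i+w_j)} = \tfrac12\{(\alpha - \beta)^2 - (\alpha + \beta)\},$$

and

$$\chi_u(D) = \frac{1}{2^{m-1}}|\Pi|(\alpha - \beta + 1)^2 = \frac{1}{2^{m-3}}|\Pi|(2^{m-2} - \beta)^2. \qquad (51)$$
We now find the possible values of $\beta$.


Lemma 28. Let $\mathcal{R}(1, m) + s$ be a coset of $\mathcal{R}(1, m)$ in $\Pi^{\perp}$. Then for any $u$ in the
coset, $(-1)^{u \cdot w_i} = (-1)^{s \cdot w_i}$.


Proof. $u = s + r$, $r \in \mathcal{R}(1, m)$, and $(-1)^{r \cdot w_i} = 1$. Q.E.D.


The dimension of $\Pi^{\perp}$ is $3m - 1$, and the dimension of $\mathcal{R}(1, m)$ is $m + 1$, so
there are $2^{2m-2}$ cosets. We take the coset representatives to be

$$s = |0|c_1 x^{i_1}\theta_1^* + c_3 x^{i_3}\theta_3^*|0|c_3 x^{i_3}\theta_3^*|,$$

where $c_1, c_3 \in GF(2)$, $0 \le i_1, i_3 \le 2^{m-1} - 2$.

Consider the scalar product of $s$ and $w_j = |1|x^j|0|x^j\theta_1|$. Now $x^j\theta_1 \cdot x^{i_3}\theta_3^*$ is
always even (since the dual of the code generated by $\theta_1$ is generated by
$1 + \theta_1^* = \sum_{s \ne 1} \theta_s^*$, and this contains the code generated by $\theta_3^*$). The scalar
product is 1 if $x^j$ has a nonzero coefficient in $c_1 x^{i_1}\theta_1^* + c_3 x^{i_3}\theta_3^*$, and is
otherwise zero, hence $\beta = \text{wt}(c_1 x^{i_1}\theta_1^* + c_3 x^{i_3}\theta_3^*)$. Figure 15.3 shows that this
vector has weight $0$, $2^{m-2}$, or $2^{m-2} \pm 2^{(m-2)/2}$.

If $\beta = 2^{m-2}$, then $\alpha = 2^{m-2} - 1$ and from (51) $\chi_u(D)/|\mathcal{P}(m)| = 0$.

If $\beta = 2^{m-2} \pm 2^{(m-2)/2}$ then $\alpha = 2^{m-2} \mp 2^{(m-2)/2} - 1$ and $\chi_u(D)/|\mathcal{P}(m)| = 1/2^{m-2}$.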
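The two evaluations of (51) above are pure arithmetic and can be confirmed with exact fractions. The sketch below uses $|\mathcal{P}(m)| = 2^{m-1}|\Pi|$ and $\chi_u(\Pi) = |\Pi|$ for $u \in \Pi^{\perp}$, both taken from the text; the function name `chi_ratio` is hypothetical.

```python
from fractions import Fraction


def chi_ratio(m, beta):
    """chi_u(D) / |P(m)| for u in the dual of Pi, given the number beta of odd products."""
    mu = 2 ** (m - 1) - 1                  # number of coset representatives w_i
    alpha = mu - beta
    singles = alpha - beta                 # sum of (-1)^(u.w_i)
    pairs = Fraction((alpha - beta) ** 2 - (alpha + beta), 2)
    bracket = 1 + Fraction(singles, 2 ** (m - 2)) + pairs / 2 ** (m - 2)
    return bracket / 2 ** (m - 1)          # divide chi_u(D)/|Pi| by |P(m)|/|Pi|


if __name__ == "__main__":
    for m in (4, 6, 8):
        quarter = 2 ** (m - 2)
        root = 2 ** ((m - 2) // 2)
        print(m, chi_ratio(m, 0), chi_ratio(m, quarter),
              chi_ratio(m, quarter + root), chi_ratio(m, quarter - root))
```

The case $\beta = 0$ gives the ratio 1, in agreement with Lemma 25.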

If $c_1 x^{i_1}\theta_1^* + c_3 x^{i_3}\theta_3^*$ has weight $2^{m-2} \pm 2^{(m-2)/2}$, the coset representative $s$ has
weight $2^{m-1} \pm 2^{(m-2)/2}$ since $x^{i_3}\theta_3^*$ has weight $2^{m-2}$. Thus the coset of $\mathcal{R}(1, m)$ in
$\Pi^{\perp}$ is a maximum rank coset in $\mathcal{R}(2, m)$, and contains $2^m$ vectors of weight
$2^{m-1} + 2^{(m-2)/2}$ and $2^m$ vectors of weight $2^{m-1} - 2^{(m-2)/2}$. This coset contributes
$2^m/2^{m-2} = 4$ to $A'_l$ for $l = 2^{m-1} \pm 2^{(m-2)/2}$. Now from Fig. 15.3 the total number of
$s$ of weights $2^{m-1} + 2^{(m-2)/2}$ and $2^{m-1} - 2^{(m-2)/2}$ is $2^{m-2}(2^{m-1} - 1)$. This completes
Case III.

The contributions to $A'_l$ from the three cases are as follows:


Case:                              I              II     III
$A'_0$                             $1$            $0$    $0$
$A'_{2^{m-1}}$                     $2^{m+1} - 2$  $0$    $0$
$A'_{2^m}$                         $1$            $0$    $0$
$A'_{2^{m-1} \pm 2^{(m-2)/2}}$     $0$            $0$    $2^m(2^{m-1} - 1)$


Adding the three cases we obtain the weight distribution of the Kerdock code
as shown in Fig. 15.7. This proves that the distance distribution of $\mathcal{P}(m)$ is the
transform of the weight distribution of $\mathcal{K}(m)$. It remains to show that the
weight and distance distributions of $\mathcal{P}(m)$ coincide.

It is readily checked from (50) that $\mathcal{P}(m)$ has minimum distance 6. Since
$\mathcal{K}(m)$ has $\bar{s} = 3$ it follows from Theorem 3 of Ch. 6 that $\mathcal{P}(m)$ is distance
invariant. Q.E.D.

Corollary 29. $\mathcal{P}(m)$ has minimum distance 6.


Remark. $\mathcal{P}(m)$ has the same length and minimum distance as the $[2^m, 2^m -
2m - 1, 6]$ extended BCH code, but contains twice as many codewords. We
shall see in Chapter 17 that $\mathcal{P}(m)$ has the greatest possible number of
codewords for this minimum distance.

The properties of $\mathcal{P}(m)$ are summarized in Fig. 15.12.





For even $m \ge 4$, $\mathcal{P}(m)$ is a nonlinear code with

length $n = 2^m$,
contains $2^{n-2m}$ codewords,
minimum distance 6.

The general form for a codeword is Equation (49). $\mathcal{P}(m)$ is
systematic and quasi-perfect, contains twice as many codewords as
an extended double-error-correcting BCH code of the same length,
and has the greatest possible number of codewords for this
minimum distance. The weight and distance distributions coincide
and are given by Equations (36) or (50). $\mathcal{P}(4) = \mathcal{K}(4) = (16, 256, 6)$
Nordstrom-Robinson code. Also $\mathcal{R}(m - 3, m) \subset \mathcal{P}(m) \subset
\mathcal{R}(m - 2, m)$. The codewords of each weight in $\mathcal{P}(m)$ form a
3-design (Theorem 33).


Fig. 15.12. Properties of Preparata code $\mathcal{P}(m)$.


The shortened Preparata code $\mathcal{P}(m)^*$. Let $\mathcal{P}(m)^*$ be the code obtained from
$\mathcal{P}(m)$ by deleting one coordinate place (any one). In this section we show:
(1) The weight and distance distributions of $\mathcal{P}(m)^*$ do not depend on
which coordinate is deleted (Theorem 32).
(2) The codewords of each weight in $\mathcal{P}(m)^*$ form a 2-design (Theorem 33).
(3) The whole space $F^{2^m - 1}$ is the union of disjoint translates of $\mathcal{P}(m)^*$ by
vectors of weights 0, 1, 2 and 3; i.e. $\mathcal{P}(m)^*$ is quasi-perfect (Theorem 34).
Let $\mathcal{C}$ be any code containing 0, of even block length $n$, and such that all
codewords have even weight. Let $\mathcal{C}^*$ be the code obtained from $\mathcal{C}$ by deleting
a fixed coordinate place, which for convenience we suppose to be the last. Let
$\{A_i\}$ and $\{a_i\}$ be the distance distributions of $\mathcal{C}$ and $\mathcal{C}^*$. As usual the
transformed distributions are denoted by primes.


Theorem 30.
$$A'_i = a'_i + a'_{n-i}.$$


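Proof. Let $\sigma_i = \{u \in F^n : \mathrm{wt}(u) = i\}$, $\hat\sigma_i = \{u \in F^{n-1} : \mathrm{wt}(u) = i\}$. The transforms
of the distance distributions of $\mathcal{C}^*$ and $\mathcal{C}$ are, by Theorem 5 of Ch. 5,

$$|\mathcal{C}^*|^2 a'_i = \sum_{\hat w \in \hat\sigma_i} \sum_{\hat u, \hat v \in \mathcal{C}^*} (-1)^{\hat w \cdot (\hat u + \hat v)},$$

$$|\mathcal{C}|^2 A'_i = \sum_{w \in \sigma_i} \sum_{u, v \in \mathcal{C}} (-1)^{w \cdot (u + v)}$$
$$= \sum_{\hat w \in \hat\sigma_i} \sum_{\hat u, \hat v \in \mathcal{C}^*} (-1)^{\hat w \cdot (\hat u + \hat v)}
 + \sum_{\hat w \in \hat\sigma_{i-1}} \sum_{\hat u, \hat v \in \mathcal{C}^*} (-1)^{\hat w \cdot (\hat u + \hat v) + u_n + v_n},$$

where we have written $w = |\hat w|w_n|$, $u = |\hat u|u_n|$ etc., with

$$u_n = \sum_{j=1}^{n-1} u_j, \qquad v_n = \sum_{j=1}^{n-1} v_j,$$

since $\mathcal{C}$ is an even weight code. Therefore

$$|\mathcal{C}|^2 A'_i = |\mathcal{C}^*|^2 a'_i + \sum_{\hat w \in \hat\sigma_{i-1}} \sum_{\hat u, \hat v \in \mathcal{C}^*} (-1)^{(\hat w + \mathbf{1}) \cdot (\hat u + \hat v)}.$$

But the last term is $|\mathcal{C}^*|^2 a'_{n-i}$. Q.E.D.

Corollary 31. $A'_i = A'_{n-i}$; in particular $A'_0 = A'_n = 1$.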


Remark. Theorem 30 and Corollary 31 are trivial for linear codes, since $\{A'_i\}$
and $\{a'_i\}$ are the weight distributions of $\mathcal{C}^\perp$ and $(\mathcal{C}^*)^\perp$.


Problem. (15) For linear codes show that $(\mathcal{C}^*)^\perp$ is obtained from $\mathcal{C}^\perp$ by
deleting the fixed coordinate place and discarding all resulting odd weight
vectors.

We now apply Theorem 30 to $\mathcal{P}(m)$; viz.

$$A'_{2^{m-1} \pm 2^{(m-2)/2}} = a'_{2^{m-1} - 2^{(m-2)/2}} + a'_{2^{m-1} + 2^{(m-2)/2}},$$
$$A'_{2^{m-1}} = 2a'_{2^{m-1}}.$$

Thus the transform of the distance distribution of $\mathcal{P}(m)^*$ contains $s' = 3$
nonzero terms besides $a'_0 = 1$. Since $\mathcal{P}(m)^*$ has minimum distance $d$ at least 5,
we can find $a'_{2^{m-1} \pm 2^{(m-2)/2}}$ and $a'_{2^{m-1}}$ from Theorem 2 of Ch. 6. Since $s' < d$, by
Theorem 6 of Ch. 6 this is also the transform of the weight distribution. The
result is given in the following theorem.


Theorem 32. Let $\mathcal{P}(m)^*$ be the punctured Preparata code of length $n - 1$,
where $n = 2^m$, $m$ even $\ge 4$.
(1) The transform of the distance (and weight) distribution of $\mathcal{P}(m)^*$ is:

$$a'_0 = 1, \qquad a'_{n/2} = n - 1,$$
$$a'_{(n - \sqrt n)/2} = \tfrac14 (n - 2)(n + \sqrt n), \qquad a'_{(n + \sqrt n)/2} = \tfrac14 (n - 2)(n - \sqrt n).$$

(2) The distance (and weight) distribution of $\mathcal{P}(m)^*$ is given by the
generating function

$$\frac{2^{n-1}}{|\mathcal{P}(m)^*|} \sum_i a_i z^i = (1 + z)^{n-1}
 + \tfrac14 (n + \sqrt n)(n - 2)(1 + z)^{(n + \sqrt n)/2 - 1}(1 - z)^{(n - \sqrt n)/2}$$
$$+ \tfrac14 (n - \sqrt n)(n - 2)(1 + z)^{(n - \sqrt n)/2 - 1}(1 - z)^{(n + \sqrt n)/2}
 + (n - 1)(1 + z)^{n/2 - 1}(1 - z)^{n/2},$$

where $|\mathcal{P}(m)| = |\mathcal{P}(m)^*| = 2^{n - 2m}$.

(3) The distance (and weight) distribution of $\mathcal{P}(m)^*$ does not depend on
which coordinate of $\mathcal{P}(m)$ is deleted.

(4) The minimum distance of $\mathcal{P}(m)^*$ is 5.


Theorem 33. The codewords of any fixed weight in $\mathcal{P}(m)^*$ form a 2-design. In
particular the

$$a_5 = \tfrac{1}{60}(2^m - 1)(2^m - 2)(2^m - 4) \qquad (52)$$

codewords of weight 5 in $\mathcal{P}(m)^*$ form a

$$2\text{-}(2^m - 1, 5, \tfrac13(2^m - 4)) \text{ design.} \qquad (53)$$

The codewords of any fixed weight in $\mathcal{P}(m)$ form a 3-design. In particular the

$$A_6 = \tfrac{1}{360}\, 2^m (2^m - 1)(2^m - 2)(2^m - 4) \qquad (54)$$

codewords of weight 6 in $\mathcal{P}(m)$ form a

$$3\text{-}(2^m, 6, \tfrac13(2^m - 4)) \text{ design.} \qquad (55)$$
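
As a quick numerical check of (52)-(55), the block counts can be recomputed from the design parameters alone. The sketch below is illustrative only: the function name is hypothetical, and it assumes the standard count $\lambda\binom{v}{t}/\binom{k}{t}$ for the number of blocks of a $t$-$(v, k, \lambda)$ design, with $\lambda = \tfrac13(2^m - 4)$ in both designs.

```python
from math import comb

def preparata_min_weight_counts(m):
    """Return (a5, A6): the number of weight-5 words in P(m)* and of weight-6 words in P(m)."""
    if m < 4 or m % 2:
        raise ValueError("m must be even and at least 4")
    n = 2 ** m
    lam = (n - 4) // 3                          # lambda of the designs (53) and (55)
    # A t-(v, k, lambda) design has lambda * C(v, t) / C(k, t) blocks.
    a5 = lam * comb(n - 1, 2) // comb(5, 2)     # 2-(n - 1, 5, lambda) design
    A6 = lam * comb(n, 3) // comb(6, 3)         # 3-(n, 6, lambda) design
    return a5, A6

if __name__ == "__main__":
    for m in (4, 6, 8):
        print(m, preparata_min_weight_counts(m))
```

For $m = 4$ this gives 42 and 112, the numbers of minimum weight words in the punctured and unpunctured Nordstrom-Robinson codes.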


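Proof. For $\mathcal{P}(m)^*$, $d = 5$ and $s' = 3$, so the first statement follows from
Theorem 24 of Ch. 6. Similarly for the third statement. To find the value of $\lambda$
in (53), observe that the annihilator polynomial of $\mathcal{P}(m)^*$ is

$$\alpha(x) = 2^{2m-1}\Big(1 - \frac{x}{2^{m-1} - 2^{(m-2)/2}}\Big)\Big(1 - \frac{x}{2^{m-1}}\Big)\Big(1 - \frac{x}{2^{m-1} + 2^{(m-2)/2}}\Big)$$
$$= P_0(x) + P_1(x) + \frac{1}{r}\,(P_2(x) + P_3(x)), \qquad (56)$$

where $r = \tfrac13(2^m - 1)$. From Theorem 23 of Ch. 6,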


Il). bn 
Ae 730 di 
$a_5$ is the number of blocks in this design and from Equation (21) of Ch. 2 is
given by Equation (52). $A_6$ then follows from Theorem 14 of Ch. 8. Q.E.D.










Problem. (16) Show that there are

$$a_6 = \tfrac{1}{360}(2^m - 1)(2^m - 2)(2^m - 4)(2^m - 6)$$

codewords of weight 6 in $\mathcal{P}(m)^*$.


The last two theorems in this section show that, first, the whole space, and
second, the Hamming code, is a union of disjoint translates of $\mathcal{P}(m)^*$.


Theorem 34. The whole space $F^{n-1}$, $n = 2^m$, is the union of disjoint translates
of $\mathcal{P}(m)^*$ by all vectors of weights 0, 1, 2, and $2^{m-1} - 1$ vectors $g_1, \ldots, g_{2^{m-1}-1}$
of weight 3.


Proof. Let $\{a_i(f)\}$ be the weight distribution of the translate $f + \mathcal{P}(m)^*$. We
shall use Equation (19) of Ch. 6, where (from Equation (56) above) $\alpha_0 = \alpha_1 = 1$,
$\alpha_2 = \alpha_3 = 1/r$. Therefore $\mathrm{wt}(f) = 1$ implies $a_0(f) = 0$, $a_1(f) = 1$, and $a_2(f) =
a_3(f) = 0$. Also $\mathrm{wt}(f) = 2$ implies $a_0(f) = a_1(f) = 0$, $a_2(f) = 1$ and $a_3(f) = r - 1 =
\tfrac13(2^m - 4)$.

The

$$1 + (2^m - 1) + \binom{2^m - 1}{2}$$

translates of $\mathcal{P}(m)^*$ by vectors of weights 0, 1 and 2 are clearly disjoint. The
number of remaining vectors in $F^{n-1}$ is $|\mathcal{P}(m)^*|(2^{m-1} - 1)$. These must lie in
translates with $a_0(f) = a_1(f) = a_2(f) = 0$, $a_3(f) = r$.

Let $T$ be the set of vectors of weight 3 in $F^{n-1}$. Write $T = T_2 \cup T_3$, where

$$T_2 = \{f \in T : \mathrm{dist}(f, u) = 2 \text{ for some } u \in \mathcal{P}(m)^*\},$$
$$T_3 = \{f \in T : \mathrm{dist}(f, u) \ge 3 \text{ for all } u \in \mathcal{P}(m)^*\}.$$

We shall show that the remaining translates of $\mathcal{P}(m)^*$ are $g + \mathcal{P}(m)^*$ for
those $g \in T_3$ which have a 1 in any fixed coordinate.


Problems. (17) If $g \in T_3$, show that the translate $g + \mathcal{P}(m)^*$ is disjoint from all
translates $f + \mathcal{P}(m)^*$, where $\mathrm{wt}(f) = 0$, 1 or 2.

(18) If $g_1, g_2 \in T_3$ have a nonempty intersection, show that the translates
$g_1 + \mathcal{P}(m)^*$ and $g_2 + \mathcal{P}(m)^*$ are disjoint.


By Theorem 33 there are $\tfrac13(2^m - 4)$ codewords of weight 5 in $\mathcal{P}(m)^*$ which
have 1's in any two fixed coordinates. Each of these codewords contains 3
vectors of $T_2$ with 1's in these two coordinates. The total number of weight 3
vectors in $F^{n-1}$ with 1's in these two coordinates is $2^m - 3$. Hence the number
in $T_3$ is $(2^m - 3) - (2^m - 4) = 1$.
Thus the vectors of $T_3$ form a $2$-$(2^m - 1, 3, 1)$ design; hence also a
$1$-$(2^m - 1, 3, \lambda)$ design with $\lambda = 2^{m-1} - 1$ from Theorem 9 of Ch. 2. Therefore
there are $2^{m-1} - 1$ vectors $g_1, \ldots, g_{2^{m-1}-1}$ in $T_3$ with a 1 in a fixed coordinate.
The translates of $\mathcal{P}(m)^*$ by these vectors are disjoint and exactly cover the
remaining vectors in $F^{n-1}$. Q.E.D.


Corollary 35. $\mathcal{P}(m)^*$ is quasi-perfect.


Theorem 36. The union of $\mathcal{P}(m)^*$ and the translates $g_i + \mathcal{P}(m)^*$, $i =
1, \ldots, 2^{m-1} - 1$, forms a linear $[2^m - 1, 2^m - 1 - m, 3]$ Hamming code.


Proof. (I) We first show that for $j = 0, 1, \ldots, 2^{m-1} - 2$


[x' |1]x' | (57) 
and 
| x’63|0|0], (58) 


define the same coset of A(m)*. In fact, their sum is | x‘(1 + @3)|1|x‘|, which is 
indeed in A(m)* -set g(x) = x’@3, f(x) = x! and a = 1 in Equation (49). 

(II) We next show that the coset of $\mathcal{P}(m)^*$ defined by (58) has minimum
weight 3; thus the vectors (57) are in fact the same as the vectors $g_1, \ldots, g_{2^{m-1}-1}$ of
Theorem 34. A typical vector of such a coset is


|@(x)(1 + 06) - ax! + x'65| f D |g G0(1 4-6) + f(x)(0 + 0: + 63) + axt. (59) 


If a — 0, then the RHS is in the Hamming code, so has weight at least 3 unless 
it is 0. If it is 0, then f(x) = g(x) 2 0 and the LHS has weight 2" 7. If a = 1, the 
LHS and the RHS are both nonzero. Including the parity check in the middle, 
the weight is > 3. 

(III) Therefore the translates of $\mathcal{P}(m)^*$ by the vectors (57) (or (58)) form a
code with $2^{2^m - m - 1}$ vectors and minimum weight 3. It remains to show that this
code is linear; by Problem 28 of Chapter 1 it is therefore equivalent to a
Hamming code. We must show that the sum of two vectors of the form (59),


(g(x) + Bax) + 8) + aix^ + ax? + (x + x)8.| 
FO) + foC1) | (aie) + AxA + 61) + GiGD + 
+ f + 01+ 04) + (aix^ + arx)O, |, (60) 
is again of the form (59). Only the case $a_1 = a_2 = 1$ gives any trouble. Since the
codewords of weight 3 in a Hamming code form a 2-design (Theorem 15 of
Ch. 2), there is a $g_3(x)$ and a $j_3$ such that
x^ + x2 +x = ga(x)(1 + 61), (61) 
and, multiplying by 6:, 
(x^  x5)0, = x^6,. (62) 
Note that the $x^j\theta_3$ are coset representatives for cosets of the
double-error-correcting BCH code in the Hamming code (of length $2^{m-1} - 1$).
Hence we can write


gx XX1 + 01) = x^0,4 B(x)(1 + 0; 03), (63) 
for suitable j, and B(x). Also 
(1 + 0:){1 + 6,4 03) = 14+ 010 63. (64) 


From Equations (61)-(64), we see that (60) is of the form of Equation (59) 
with g(x) = gi(x) + g(x) + B(x)(1 + 8, + 63), a = 1, j = js, x'6 = (x^? + x? + x90, 
(this can be done since 6, is the idempotent of a simplex code), and 
f(x) = fix) + fx) + Boo + 6, + 03). Q.E.D. 


Problems. (19) Preparata defined 9 (m) to consist of all codewords 
| m(1) + q(1)| m(x) + q(x)| i| m(x) + (m(1) + i65 s(x) + q09)6:| 


where m(x) is in the Hamming code generated by M(x), s(x) is in the 
distance 6 BCH code generated by M(x)M(x)M°(x),, q(x)€ 
{0, 1, x,...,x?" 77, and i=0 or 1. Show that this defines the same code as 
Equations (47) and (48). 

(20) (a) Show that the first $2^m - 2m$ bits of $\mathcal{P}(m)^*$ can be taken to be
information symbols; hence the code is systematic.

(b) Show that the remaining symbols of $\mathcal{P}(m)^*$ are quadratic functions of
the information symbols.

(21) In the notation of Problem 16 of Ch. 14, show that the Nordstrom- 
Robinson code consists of the [16,5, 8] Reed-Muller code and the seven 
cosets corresponding to the bent functions 


[ILE e & 


(22) Show that the nonlinear code of Fig. 5.1 consists of the vectors 
(xi, ..., Xs) where the x; satisfy five quadratic equations. 


*§7. Goethals' generalization of the Preparata codes


We saw in §5 that the codes $\mathcal{DG}(m, d)$ generalize the Kerdock codes, for
$\mathcal{DG}(m, m/2) = \mathcal{K}(m)$ (for $m$ even). The Preparata codes, as we have just seen, are
a kind of dual to the Kerdock code, in the sense that the weight distribution of
$\mathcal{P}(m)$ is the MacWilliams transform of that of $\mathcal{K}(m)$. In this section we
construct a nonlinear triple-error-correcting code (m) which is the same
kind of dual of $\mathcal{DG}(m, \tfrac12(m - 2))$.
The construction is due to Goethals [495] and [496], and we refer the reader
to his papers for a proof of the following facts.

For $m = 2t + 2 \ge 6$, (m) is a nonlinear code of length $2^m$, containing $2^k$
codewords where $k = 2^m - 3m + 1$, and with minimum distance 8. The first
few codes have parameters


$$(64, 2^{47}, 8), \quad (256, 2^{233}, 8), \quad (1024, 2^{995}, 8), \ldots.$$


(m) contains 4 times as many codewords as the extended triple-error-correcting
BCH code of the same length.

As usual, (m) consists of a linear code $\pi$ together with $2^{m-1} - 1$ cosets of
$\pi$. $\pi$ consists of the codewords


lg(DIgoX0 + 601f (D g(D[gGo0(0 + 0) + FOU + 0,26, 6.))], (65) 


where $r = 1 + 2^{t-1}$ and $s = 1 + 2^t$. Hence $\pi$ has dimension $2^m - 4m + 2$. The
coset representatives $w_i$ are given by Equation (48).

The weight and distance distribution of .£(m) coincide, and are equal to the 
MacWilliams transform of the weight distribution of $\mathcal{DG}(m, \tfrac12(m - 2))$. The
latter distribution is given in Fig. 15.13 (although this won't be proved until
Ch. 21).


    i                          A_i
    0 or 2^{2t+2}              1
    2^{2t+1} ± 2^{t+1}         2^{2t}(2^{2t+1} - 1)(2^{2t+2} - 1)/3
    2^{2t+1} ± 2^{t}           2^{2t+2}(2^{2t+1} - 1)(2^{2t+1} + 4)/3
    2^{2t+1}                   2(2^{2t+2} - 1)(2^{4t+1} - 2^{2t} + 1)


Fig. 15.13. Weight (or distance) distribution of $\mathcal{DG}(m, \tfrac12(m - 2))$.
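
The entries of Fig. 15.13 can be sanity-checked numerically. The sketch below is only a consistency check, not a proof. It assumes that the table reads, for $m = 2t + 2$: $A_0 = A_{2^{2t+2}} = 1$, $A_{2^{2t+1} \pm 2^{t+1}} = 2^{2t}(2^{2t+1} - 1)(2^{2t+2} - 1)/3$, $A_{2^{2t+1} \pm 2^t} = 2^{2t+2}(2^{2t+1} - 1)(2^{2t+1} + 4)/3$ and $A_{2^{2t+1}} = 2(2^{2t+2} - 1)(2^{4t+1} - 2^{2t} + 1)$. The function name is hypothetical. The check confirms that the total is $2^{3m-1}$ and that $\sum_i (n - 2i)^2 A_i = n \sum_i A_i$, as must hold for a code whose transformed distribution has $A'_1 = A'_2 = 0$.

```python
def dg_weight_distribution(t):
    """Weight distribution {weight: count} as read from Fig. 15.13, with m = 2t + 2."""
    q = 2 ** (2 * t)                     # q = 2^(2t), so n = 4q and n/2 = 2q
    n, half = 4 * q, 2 * q
    outer = q * (2 * q - 1) * (4 * q - 1) // 3          # weights n/2 +- 2^(t+1)
    inner = 4 * q * (2 * q - 1) * (2 * q + 4) // 3      # weights n/2 +- 2^t
    middle = 2 * (4 * q - 1) * (2 * q * q - q + 1)      # weight n/2
    return {
        0: 1,
        n: 1,
        half - 2 ** (t + 1): outer,
        half + 2 ** (t + 1): outer,
        half - 2 ** t: inner,
        half + 2 ** t: inner,
        half: middle,
    }

if __name__ == "__main__":
    for t in (2, 3, 4):
        dist = dg_weight_distribution(t)
        m, n = 2 * t + 2, 2 ** (2 * t + 2)
        total = sum(dist.values())
        second_moment = sum((n - 2 * w) ** 2 * a for w, a in dist.items())
        print(m, total == 2 ** (3 * m - 1), second_moment == n * total)
```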


Research Problems (15.2). Let 2'(m) be obtained from 9 (m) by changing 6; to 
0, in the linear subcode. Does 9"'(m) have the same properties as (m)? 
(15.3) Find a code whose distance distribution is the transform of that of
$\mathcal{DG}(m, d)$, for any $d$.
(15.4) Show that $\mathcal{P}(m)$ and $\mathcal{K}(m)$ contain at least twice as many codewords
as any linear code of the same length and minimum distance. What about
$\mathcal{DG}(m, d)$ and (m)?


Notes on Chapter 15 


§2. Theorem 2 is due to Albert [18], see also MacWilliams [878]. For Theorem
4 see Dickson [374, p. 197]. Theorem 8 is from Sloane and Berlekamp [1232];
see also McEliece [936]. Properties of the codewords of minimum weight in
$\mathcal{R}(2, m)$ have been studied by Berman and Yudanina [138].

For Gaussian binomial coefficients see Berman and Fryer [134], Goldman
and Rota [519] or Pólya and Alexanderson [1066]. For Problem 4 see Sarwate
[1144]; Problem 5 is due to Delsarte and Goethals [362] and Wolfmann [1431].


§3. Theorem 10 is from Kasami and Tokura [745], and extends the results of
Berlekamp and Sloane [132].

Theorem 12 extends an earlier result of Solomon and McEliece [1256]; the
proof is given in [939] and [941]. This result has been extended to Abelian
group codes - see Delsarte [347] and Delsarte and McEliece [369]. Van Lint
[848] has given a direct proof of Corollary 13.


§4. The weight distributions given in §4 are due to Kasami [727, 729].


§5. For odd values of $m$ the linear codes described in this section were
analyzed by Berlekamp [118] and Delsarte and Goethals [364]. The weight
distributions of other subcodes of $\mathcal{R}(2, m)$ can be found in [118, 729]. See also
Dowling [384].


Research Problem. (15.5) $\mathcal{K}(4) = \mathcal{P}(4)$ is unique. Is either $\mathcal{K}(m)$ or $\mathcal{P}(m)$
unique for $m > 4$?


We mention (without giving any more explanation) the following result: 


Theorem 38. (Berlekamp [120], Goethals [493], Snover [1247].) The automorphism
groups of the Nordstrom-Robinson codes $\mathcal{K}(4)^*$ and $\mathcal{K}(4)$ are respectively
the alternating group $\mathcal{A}_7$, and $\mathcal{A}_7$ extended by the elementary abelian
group of order 16. The latter is a triply transitive group of order $16 \cdot 15 \cdot 14 \cdot 12$.


The Kerdock codes were discovered in 1972; see Kerdock [758]. The
description given here follows N. Patterson (unpublished) and Delsarte and
Goethals [364]. The generalizations to the codes $\mathcal{DG}(m, d)$ were also given in
[364]. Another way of looking at the Kerdock codes has been given by
Cameron and Seidel [235]. For a possible generalization to GF(3) see
Patterson [1031].


§6. The Preparata codes were introduced in [1081] (see also [1080, 1082]). In
this paper Preparata also shows that these codes are systematic, and gives
algebraic encoding and decoding methods for them. Mykkeltveit [978] has
shown that the Kerdock codes are systematic. The weight distribution of
$\mathcal{P}(m)$ was first obtained by Semakov and Zinov'ev [1181]. In Ch. 18 we shall
use the Preparata codes to construct a number of other good nonlinear
double-error-correcting codes.

The Preparata codes are the chief example of a class of codes called nearly
perfect codes (the definition will be given in Ch. 17). A more general class of
codes are uniformly packed codes. See Bassalygo et al. [78, 79], Goethals and
Snover [504], Goethals and Van Tilborg [505], Lundström et al. [844a, 844b],
Semakov et al. [1183] and Van Tilborg [1326, 1327]. The proof of Theorem 36
follows Zaitsev et al. [1451].


§7. The material in this section is due to Goethals [495] and [496]. Research
problems 15.2 and 15.3 have been solved by Goethals [498].





Quadratic-residue codes 


§1. Introduction


The quadratic-residue (QR) codes $\mathcal{Q}$, $\bar{\mathcal{Q}}$, $\mathcal{N}$, $\bar{\mathcal{N}}$ are cyclic codes of prime
block length $p$ over a field $GF(l)$, where $l$ is another prime which is a
quadratic residue modulo $p$. (Only the cases $l = 2$ and 3 have been studied to
any extent.) $\mathcal{Q}$ and $\mathcal{N}$ are equivalent codes with parameters $[p, \tfrac12(p + 1), d \ge
\sqrt p]$, while $\bar{\mathcal{Q}}$ and $\bar{\mathcal{N}}$ are equivalent codes with parameters $[p, \tfrac12(p - 1), d \ge
\sqrt p]$. Also $\mathcal{Q} \supset \bar{\mathcal{Q}}$ and $\mathcal{N} \supset \bar{\mathcal{N}}$. These codes are defined in §2 and a summary of
their properties is given in Fig. 16.1.

Examples of quadratic-residue codes are the binary [7, 4, 3] Hamming
code, and the binary [23, 12, 7] and ternary [11, 6, 5] perfect Golay codes $\mathcal{G}_{23}$
and $\mathcal{G}_{11}$ (see §10 of Ch. 6 and Ch. 20). Other examples are given in Fig. 16.2.

Thus QR codes have rate close to $\tfrac12$, and tend to have high minimum
distance (at least if $p$ is not too large, but see Research Problem 16.1). Several
techniques for decoding QR and other cyclic codes are described in §9. The
most powerful method is permutation decoding, which makes use of the fact
that these codes have large automorphism groups.

Other properties of QR codes discussed in this chapter are idempotents
(§3), dual codes and extended codes of length $p + 1$ (§4), and automorphism
groups (§5). Extended QR codes are fixed by the group $PSL_2(p)$ if $l = 2$, or by
a slight generalization of this group if $l > 2$ (Theorem 12), and in many cases
this is the full group (Theorem 13).

Further properties of binary QR codes are described in §6, including
methods for finding the true minimum distance. It is also shown in §6 that
some QR codes have a generator matrix of the form $[I \mid A]$, where $A$ is a
circulant or bordered circulant matrix. Such codes are called double circulant
codes. Double circulant codes over GF(2) and GF(3) for a particular choice of
$A$ are studied in §§7, 8. These have properties similar to QR codes, although
less is known about them. The codes over GF(3) are the Pless symmetry
codes.

In this chapter, proofs are usually given only for the case $p = 4k - 1$; the
case $p = 4k + 1$ being similar and left to the reader.


§2. Definition of quadratic-residue codes 


We are going to define quadratic-residue (QR) codes of prime length $p$ over
$GF(l)$, where $l$ is another prime which is a quadratic residue mod $p$. In the
important case of binary quadratic residue codes ($l = 2$), this means that $p$ has
to be a prime of the form $8m \pm 1$ (by Theorem 23 of the Notes).

Let $Q$ denote the set of quadratic residues modulo $p$ and $N$ the set of
nonresidues. (See §3 of Ch. 2 and §6 of Ch. 4.) If $\rho$ is a primitive element of
the field $GF(p)$, then $\rho^e \in Q$ iff $e$ is even, while $\rho^e \in N$ iff $e$ is odd. Thus $Q$ is
a cyclic group generated by $\rho^2$. Since $l \in Q$, the set $Q$ is closed under
multiplication by $l$. Thus $Q$ is a disjoint union of cyclotomic cosets mod $p$.
Hence

$$q(x) = \prod_{r \in Q} (x - \alpha^r) \quad \text{and} \quad n(x) = \prod_{n \in N} (x - \alpha^n) \qquad (1)$$

have coefficients from $GF(l)$, where $\alpha$ is a primitive $p$-th root of unity in some
field containing $GF(l)$. Also

$$x^p - 1 = (x - 1)\,q(x)\,n(x). \qquad (2)$$

Let $R$ be the ring $GF(l)[x]/(x^p - 1)$.


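Definition. The quadratic-residue codes $\mathcal{Q}$, $\bar{\mathcal{Q}}$, $\mathcal{N}$, $\bar{\mathcal{N}}$ are cyclic codes (or ideals)
of $R$ with generator polynomials

$$q(x), \quad (x - 1)q(x), \quad n(x), \quad (x - 1)n(x) \qquad (3)$$

respectively. Sometimes $\mathcal{Q}$ and $\mathcal{N}$ are called augmented QR codes, and $\bar{\mathcal{Q}}$ and
$\bar{\mathcal{N}}$ expurgated QR codes.

Clearly $\mathcal{Q} \supset \bar{\mathcal{Q}}$ and $\mathcal{N} \supset \bar{\mathcal{N}}$. In the binary case $\bar{\mathcal{Q}}$ is the even weight subcode
of $\mathcal{Q}$, and $\bar{\mathcal{N}}$ is the even weight subcode of $\mathcal{N}$.

The permutation of coordinates in $R$ induced by $x \to x^n$ for a fixed
nonresidue $n$ interchanges $\mathcal{Q}$ and $\mathcal{N}$, and also $\bar{\mathcal{Q}}$ and $\bar{\mathcal{N}}$, so that these codes are
equivalent. $\mathcal{Q}$ and $\mathcal{N}$ have dimension $\tfrac12(p + 1)$, and $\bar{\mathcal{Q}}$ and $\bar{\mathcal{N}}$ have dimension
$\tfrac12(p - 1)$. (See Fig. 16.1.)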


Example. If $l = 2$ and $p = 7$, then $\mathcal{Q}$ has generator polynomial $(x + \alpha)(x +
\alpha^2)(x + \alpha^4) = x^3 + x + 1$, where $\alpha \in GF(2^3)$, and is the [7, 4, 3] Hamming code.
$\bar{\mathcal{Q}}$ is the [7, 3, 4] subcode with generator polynomial $(x + 1)(x^3 + x + 1)$. $\mathcal{N}$ is the
equivalent [7, 4, 3] Hamming code with generator polynomial $(x + \alpha^3)(x +
\alpha^5)(x + \alpha^6) = x^3 + x^2 + 1$. (Note that a different choice of $\alpha$ would interchange
$\mathcal{Q}$ and $\mathcal{N}$.)


The Golay codes. It has already been mentioned in §6 of Ch. 2 and §10 of Ch.
6 that there are two perfect Golay codes, namely a [23, 12, 7] binary code $\mathcal{G}_{23}$
and an [11, 6, 5] ternary code $\mathcal{G}_{11}$. One definition of $\mathcal{G}_{23}$ was given in Ch. 2. We
now give a second definition and also define $\mathcal{G}_{11}$, as QR codes.


Definition. The Golay code $\mathcal{G}_{23}$ is the QR code $\mathcal{Q}$ (or $\mathcal{N}$) for $l = 2$ and $p = 23$.
The Golay code $\mathcal{G}_{11}$ is the code $\mathcal{Q}$ (or $\mathcal{N}$) for $l = 3$ and $p = 11$. These have
parameters [23, 12, 7] and [11, 6, 5] respectively. (It will be shown in §7 that
the two definitions of $\mathcal{G}_{23}$ are equivalent. This will also follow from Corollary
16 of Ch. 20.)


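Over GF(2) (see Fig. 7.1)

$$x^{23} + 1 = (x + 1)(x^{11} + x^{10} + x^6 + x^5 + x^4 + x^2 + 1)(x^{11} + x^9 + x^7 + x^6 + x^5 + x + 1),$$

so the generator polynomial of $\mathcal{G}_{23}$ can be taken to be either

$$x^{11} + x^{10} + x^6 + x^5 + x^4 + x^2 + 1 \quad \text{or} \quad x^{11} + x^9 + x^7 + x^6 + x^5 + x + 1. \qquad (4)$$

Over GF(3),

$$x^{11} - 1 = (x - 1)(x^5 + x^4 - x^3 + x^2 - 1)(x^5 - x^3 + x^2 - x - 1),$$

so the generator polynomial of $\mathcal{G}_{11}$ can be taken to be either

$$x^5 + x^4 - x^3 + x^2 - 1 \quad \text{or} \quad x^5 - x^3 + x^2 - x - 1. \qquad (5)$$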
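
The two factorizations behind (4) and (5) are easy to confirm by machine. The following is a minimal sketch with hypothetical helper names. It assumes the factors are $x^{11} + x^{10} + x^6 + x^5 + x^4 + x^2 + 1$ and $x^{11} + x^9 + x^7 + x^6 + x^5 + x + 1$ over GF(2), and $x^5 + x^4 - x^3 + x^2 - 1$ and $x^5 - x^3 + x^2 - x - 1$ over GF(3), and multiplies them out with coefficients reduced modulo the field characteristic.

```python
def poly_mul(a, b, q):
    """Multiply two polynomials over GF(q); coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % q
    return out

def from_terms(terms, q):
    """Build a coefficient list from a dict {exponent: coefficient}."""
    coeffs = [0] * (max(terms) + 1)
    for exponent, coefficient in terms.items():
        coeffs[exponent] = coefficient % q
    return coeffs

# Binary case: x^23 + 1 = (x + 1) g1(x) g2(x).
g1 = from_terms({e: 1 for e in (11, 10, 6, 5, 4, 2, 0)}, 2)
g2 = from_terms({e: 1 for e in (11, 9, 7, 6, 5, 1, 0)}, 2)
binary_product = poly_mul(poly_mul([1, 1], g1, 2), g2, 2)

# Ternary case: x^11 - 1 = (x - 1) h1(x) h2(x).
h1 = from_terms({5: 1, 4: 1, 3: -1, 2: 1, 0: -1}, 3)
h2 = from_terms({5: 1, 3: -1, 2: 1, 1: -1, 0: -1}, 3)
ternary_product = poly_mul(poly_mul(from_terms({1: 1, 0: -1}, 3), h1, 3), h2, 3)

if __name__ == "__main__":
    print(binary_product == from_terms({23: 1, 0: 1}, 2))
    print(ternary_product == from_terms({11: 1, 0: -1}, 3))
```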


Other examples are shown in Fig. 16.2. (Upper and lower bounds on d are 
given in some cases.) 


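They are defined by (3) and are codes over $GF(l)$ with

    length $p$ = prime,
    dimension $\tfrac12(p + 1)$ for $\mathcal{Q}$, $\mathcal{N}$; $\tfrac12(p - 1)$ for $\bar{\mathcal{Q}}$, $\bar{\mathcal{N}}$,
    minimum distance $d \ge \sqrt p$,

where $l$ is a prime which is a quadratic residue mod $p$. If $p = 4k - 1$,
$d^2 - d + 1 \ge p$ (Theorem 1). Idempotents are given by Theorems 2
and 4. For generator matrices see Equations (22), (23), (28), (31), (39),
(44), (45). If $p = 4k - 1$, $\mathcal{Q}^\perp = \bar{\mathcal{Q}}$, $\mathcal{N}^\perp = \bar{\mathcal{N}}$; if $p = 4k + 1$, $\mathcal{Q}^\perp = \bar{\mathcal{N}}$,
$\mathcal{N}^\perp = \bar{\mathcal{Q}}$. The extended codes $\hat{\mathcal{Q}}$, $\hat{\mathcal{N}}$ are defined by adding the overall
parity check (27). If $p = 4k - 1$, $\hat{\mathcal{Q}}$ and $\hat{\mathcal{N}}$ are self-dual; if $p = 4k + 1$,
$(\hat{\mathcal{Q}})^\perp = \hat{\mathcal{N}}$. Aut$(\hat{\mathcal{Q}})$ contains $PSL_2(p)$ (Theorems 10 and 12). Figure
16.2 gives examples and Fig. 16.3 properties of the binary codes.

Fig. 16.1. Properties of quadratic-residue codes $\mathcal{Q}$, $\bar{\mathcal{Q}}$, $\mathcal{N}$, $\bar{\mathcal{N}}$.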
(a) Over GF(2)

      n    k    d    |    n    k    d    |    n    k    d
      8    4    4    |   74   37   14    |  138   69   14-22
     18    9    6    |   80   40   16    |  152   76   20
     24   12    8    |   90   45   18    |  168   84   16-24
     32   16    8    |   98   49   16    |  192   96   16-28
     42   21   10    |  104   52   20    |  194   97   16-28


(b) Over GF(3)

      n    k    d    |    n    k    d    |    n    k    d
     12    6    6    |   48   24   15    |   74   37    ?
     14    7    6    |   60   30   18    |   84   42    ?
     24   12    9    |   62   31    ?    |   98   49    ?
     38   19    ?    |   72   36    ?    |


Fig. 16.2. A table of extended quadratic-residue codes $\hat{\mathcal{Q}}$.


The square root bound on the minimum distance.


Theorem 1. If $d$ is the minimum distance of $\mathcal{Q}$ or $\mathcal{N}$, then $d^2 \ge p$. Furthermore,
if $p = 4k - 1$, then this can be strengthened to

$$d^2 - d + 1 \ge p. \qquad (6)$$


Proof. Let $a(x)$ be a codeword of minimum nonzero weight $d$ in $\mathcal{Q}$. If $n$ is a
nonresidue, $\bar a(x) = a(x^n)$ is a codeword of minimum weight in $\mathcal{N}$. Then
$a(x)\bar a(x)$ must be in $\mathcal{Q} \cap \mathcal{N}$, i.e. is a multiple of

$$\prod_{r \in Q} (x - \alpha^r) \prod_{n \in N} (x - \alpha^n) = \prod_{j=1}^{p-1} (x - \alpha^j) = \sum_{j=0}^{p-1} x^j.$$

Thus $a(x)\bar a(x)$ has weight $p$. Since $a(x)$ has weight $d$, the maximum number of
nonzero coefficients in $a(x)\bar a(x)$ is $d^2$, so that $d^2 \ge p$.

If $p = 4k - 1$, we may take $n = -1$ by property (Q2) of Ch. 2. Now in the
product $a(x)a(x^{-1})$ there are $d$ terms equal to 1, so the maximum weight of
the product is $d^2 - d + 1$. Q.E.D.


Example. The $[7, 4, d]$ quadratic residue code over $GF(l)$ has $d \ge 3$ (and if
$l = 2$, $d = 3$).
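
To see what Theorem 1 gives for other lengths, the bound can be evaluated directly. This is a small illustrative sketch with a hypothetical function name. It returns the least $d$ satisfying $d^2 - d + 1 \ge p$ when $p = 4k - 1$ and $d^2 \ge p$ otherwise. It says nothing about the true minimum distance, which is often larger.

```python
def square_root_bound(p):
    """Least d permitted by Theorem 1 for a QR code of prime length p."""
    d = 1
    while True:
        if p % 4 == 3:
            ok = d * d - d + 1 >= p      # strengthened bound (6), valid for p = 4k - 1
        else:
            ok = d * d >= p              # plain square root bound
        if ok:
            return d
        d += 1

if __name__ == "__main__":
    for p in (7, 17, 23, 31, 47):
        print(p, square_root_bound(p))
```

For $p = 23$ the bound gives $d \ge 6$. Theorem 8 below is what raises this to 7 for the binary Golay code.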
As the codes in Fig. 16.2 show, $d$ is often greater than the bound given by
Theorem 1.


Research Problems (16.1). Fill in the gaps in Fig. 16.2, extend these tables, and
compute similar tables for other primes $l$.

(16.2) How does the minimum distance of QR codes behave as $p \to \infty$?


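§3. Idempotents of quadratic-residue codes


The case $l = 2$ is considered first.


Theorem 2. If $l = 2$ and $p = 4k - 1$, then $\alpha$ can be chosen so that the
idempotents of $\mathcal{Q}$, $\bar{\mathcal{Q}}$, $\mathcal{N}$, $\bar{\mathcal{N}}$ are

$$E_q(x) = \sum_{r \in Q} x^r, \qquad F_q(x) = 1 + \sum_{n \in N} x^n,$$
$$E_n(x) = \sum_{n \in N} x^n, \qquad F_n(x) = 1 + \sum_{r \in Q} x^r, \qquad (7)$$

respectively.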


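Proof. Since 2 is a quadratic residue mod $p$ (Theorem 23), $(E_q(x))^2 = E_q(x)$,
etc., so these polynomials are idempotent. Thus $E_q(\alpha^i) = 0$ or 1 by Lemma 2
of Ch. 8. For any quadratic residue $s$,

$$E_q(\alpha^s) = \sum_{r \in Q} \alpha^{rs} = \sum_{r \in Q} \alpha^r = E_q(\alpha),$$

independent of $s$. Similarly

$$E_q(\alpha^t) = \sum_{r \in Q} \alpha^{rt} = \sum_{n \in N} \alpha^n = E_q(\alpha^{-1}),$$

for any nonresidue $t$. Since $E_q(\alpha) + E_q(\alpha^{-1}) = 1$, either

$$E_q(\alpha^s) = 0 \text{ for all } s \in Q \quad \text{and} \quad E_q(\alpha^t) = 1 \text{ for all } t \in N \qquad (8)$$

or

$$E_q(\alpha^s) = 1 \text{ for all } s \in Q \quad \text{and} \quad E_q(\alpha^t) = 0 \text{ for all } t \in N,$$

depending on the choice of $\alpha$. Let us choose $\alpha$ so that (8) holds; then $E_q(x)$ is
the idempotent of $\mathcal{Q}$. Also

$$E_n(\alpha^t) = \sum_{n \in N} \alpha^{nt} = \sum_{r \in Q} \alpha^r = 0 \quad \text{for } t \in N,$$

and $E_n(\alpha^s) = 1$ for $s \in Q$. Thus $E_n(x)$ is the idempotent of $\mathcal{N}$.
Finally, $F_q(\alpha^s) = 0$ for $s \in Q$ and $F_q(1) = 0$, so $F_q(x)$ is the idempotent of $\bar{\mathcal{Q}}$.
Similarly for $\bar{\mathcal{N}}$. Q.E.D.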


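Note. If $l = 2$ and $p = 4k + 1$, the idempotents of $\mathcal{Q}$, $\bar{\mathcal{Q}}$, $\mathcal{N}$, $\bar{\mathcal{N}}$ may be taken to
be

$$1 + \sum_{r \in Q} x^r, \qquad \sum_{n \in N} x^n, \qquad 1 + \sum_{n \in N} x^n, \qquad \sum_{r \in Q} x^r \qquad (9)$$

respectively.


Example. If $l = 2$ and $p = 23$, then $Q = \{1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18\}$ and the
idempotents of $\mathcal{Q}$ and $\mathcal{N}$ are

$$\sum_{r \in Q} x^r \quad \text{and} \quad \sum_{r \in Q} x^{-r} \qquad (10)$$

respectively. Both generate a code equivalent to the Golay code $\mathcal{G}_{23}$.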


Idempotents if $l > 2$. In describing the idempotents for $l > 2$, the following
number-theoretic result will be useful. Recall from property (Q3) of Ch. 2 that
the Legendre symbol $\chi(i)$ is defined by

$$\chi(i) = \begin{cases} 0 & \text{if } i \text{ is a multiple of } p, \\ 1 & \text{if } i \text{ is a quadratic residue mod } p, \\ -1 & \text{if } i \text{ is a nonresidue mod } p. \end{cases}$$

Also $\chi(i)\chi(j) = \chi(ij)$. Define the Gaussian sum

$$\theta = \sum_{i=1}^{p-1} \chi(i)\,\alpha^i. \qquad (11)$$

Since $\theta^l = \theta$, $\theta \in GF(l)$.


Theorem 3. If $p = 4k - 1$, then $\theta^2 = -p$. (This result holds if $\alpha$ is a primitive $p$-th
root of unity over any field.)


Proof.

$$\theta^2 = \sum_{i=1}^{p-1} \sum_{j=1}^{p-1} \chi(i)\chi(j)\,\alpha^{i+j}.$$

The $p - 1$ terms in the sum with $i + j = p$ all have coefficient $-1$, for one of $i$
and $j$ is a residue and the other is a nonresidue mod $p$ (since $-1$ is a
nonresidue). Hence

$$\theta^2 = -(p - 1) + \sum_{i + j \ne p} \chi(i)\chi(j)\,\alpha^{i+j}.$$

The terms with $i = j$ contribute

$$\sum_{i=1}^{p-1} \chi(i)^2 \alpha^{2i} = \sum_{i=1}^{p-1} \alpha^{2i} = \sum_{i=1}^{p-1} \alpha^i = -1.$$

Thus

$$\theta^2 = -p + \sum_{k=1}^{p-1} \psi(k)\,\alpha^k,$$

where

$$\psi(k) = \sum_{\substack{1 \le i \le p-1 \\ i \ne k,\ 2i \ne k}} \chi(i(k - i)).$$

It remains to show that $\psi(k) = 0$. Let $M_k = \{t : t = i(k - i)$ for some $i$ with
$1 \le i \le p - 1$, $i \ne k$, $2i \ne k\}$. Then $|M_k| = \tfrac12(p - 3)$. We show that $M_k$ contains
$\tfrac14(p - 3)$ residues and $\tfrac14(p - 3)$ nonresidues mod $p$, so that indeed $\psi(k) = 0$. If
$t = i(k - i)$ then $i^2 - ki + t = 0$, and $k^2 - 4t = ((i^2 - t)/i)^2$. Since $i^2 \ne t$, $k^2 - 4t \in
Q$. Thus $-4t = r - k^2$ for some $r \in Q$, $r \ne k^2$; or $\{-4M_k\} = \{r - k^2 : r \in Q$,
$r \ne k^2\}$. By Perron's Theorem (Theorem 24 of the Notes), this set contains
equal numbers of residues and nonresidues. Q.E.D.


Note. If $p = 4k + 1$, then $\theta^2 = +p$.


Example. When $p = 7$, for any $l$,

$$\theta = \alpha + \alpha^2 - \alpha^3 + \alpha^4 - \alpha^5 - \alpha^6,$$

$$\theta^2 = -6 + \sum_{i=1}^{6} \alpha^i = -7.$$
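
Since Theorem 3 holds for a primitive $p$-th root of unity over any field, it can be checked numerically over the complex numbers. The sketch below is an illustration only. It takes $\alpha = e^{2\pi i/p}$ (an assumption made for the check, not the $\alpha$ used for the codes), computes the Legendre symbol by Euler's criterion, and compares $\theta^2$ with $-p$ for $p = 4k - 1$ and with $+p$ for $p = 4k + 1$. The function names are hypothetical.

```python
import cmath

def legendre(i, p):
    """Legendre symbol chi(i) modulo an odd prime p, via Euler's criterion."""
    i %= p
    if i == 0:
        return 0
    return 1 if pow(i, (p - 1) // 2, p) == 1 else -1

def gauss_sum(p):
    """theta = sum_{i=1}^{p-1} chi(i) alpha^i with alpha = exp(2*pi*i/p)."""
    alpha = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(i, p) * alpha ** i for i in range(1, p))

if __name__ == "__main__":
    for p in (7, 11, 13, 17, 23):
        expected = -p if p % 4 == 3 else p
        print(p, abs(gauss_sum(p) ** 2 - expected) < 1e-9)
```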


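Theorem 4. If $l > 2$ and $p = 4k \pm 1$ the idempotents of $\mathcal{Q}$, $\bar{\mathcal{Q}}$, $\mathcal{N}$, $\bar{\mathcal{N}}$ are

$$E_q(x) = \frac12\Big(1 + \frac1p\Big) + \frac12\Big(\frac1p - \frac1\theta\Big) \sum_{r \in Q} x^r + \frac12\Big(\frac1p + \frac1\theta\Big) \sum_{n \in N} x^n, \qquad (12)$$
$$F_q(x) = \frac12\Big(1 - \frac1p\Big) - \frac12\Big(\frac1p + \frac1\theta\Big) \sum_{r \in Q} x^r - \frac12\Big(\frac1p - \frac1\theta\Big) \sum_{n \in N} x^n, \qquad (13)$$
$$E_n(x) = \frac12\Big(1 + \frac1p\Big) + \frac12\Big(\frac1p + \frac1\theta\Big) \sum_{r \in Q} x^r + \frac12\Big(\frac1p - \frac1\theta\Big) \sum_{n \in N} x^n, \qquad (14)$$
$$F_n(x) = \frac12\Big(1 - \frac1p\Big) - \frac12\Big(\frac1p - \frac1\theta\Big) \sum_{r \in Q} x^r - \frac12\Big(\frac1p + \frac1\theta\Big) \sum_{n \in N} x^n, \qquad (15)$$

respectively.


Proof. It is easy to check that for $r \in Q$,

$$E_q(\alpha^r) = F_q(\alpha^r) = 0, \qquad E_n(\alpha^r) = F_n(\alpha^r) = 1,$$

while if $n \in N$,

$$E_q(\alpha^n) = F_q(\alpha^n) = 1, \qquad E_n(\alpha^n) = F_n(\alpha^n) = 0.$$

Also $E_q(1) = E_n(1) = 1$, $F_q(1) = F_n(1) = 0$. This proves that $E_q(x)$, $F_q(x)$, $E_n(x)$,
$F_n(x)$ are the desired idempotents, from Lemma 2 of Ch. 8. Q.E.D.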


Example. If $l = 3$ and $p = 11$, then $Q = \{1, 3, 4, 5, 9\}$. Also $\alpha \in GF(3^5)$ is a
primitive 11th root of unity. Rather than calculating $\theta$ from (11) we shall use
Theorem 3: $\theta^2 = -11 \equiv 1 \pmod 3$, so $\theta = \pm 1$. We choose $\alpha$ so that $\theta = 1$. Then
from (12) and (14) the idempotents of $\mathcal{Q}$ and $\mathcal{N}$ are

$$-\sum_{r \in Q} x^r \quad \text{and} \quad -\sum_{n \in N} x^n \qquad (16)$$

respectively. Both generate a code equivalent to the Golay code $\mathcal{G}_{11}$. Note that
if $l = 3$ and $p = 4k - 1$ the idempotents will always be of this form, since $\theta^2$ is
a quadratic residue modulo 3, and so $\theta^2 = 1$.

It may be helpful to indicate how these idempotents were discovered. A
plausible guess is that they have the form

$$a + b \sum_{r \in Q} x^r + c \sum_{n \in N} x^n. \qquad (17)$$


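To square such an expression we use:


Lemma 5. Suppose $p = 4k - 1$. Then in $R$,

$$\Big(\sum_{r \in Q} x^r\Big)^2 = \tfrac14(p - 3) \sum_{r \in Q} x^r + \tfrac14(p + 1) \sum_{n \in N} x^n,$$
$$\Big(\sum_{n \in N} x^n\Big)^2 = \tfrac14(p + 1) \sum_{r \in Q} x^r + \tfrac14(p - 3) \sum_{n \in N} x^n,$$
$$\sum_{r \in Q} x^r \sum_{n \in N} x^n = \tfrac12(p - 1) + \tfrac14(p - 3) \sum_{i=1}^{p-1} x^i.$$

Proof. From Perron's Theorem 24. Q.E.D.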
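
The three identities of Lemma 5 are counting statements about sums of residues and nonresidues, so they can be verified for small primes by multiplying out with integer coefficients modulo $x^p - 1$. The sketch below does this. The helper names are hypothetical. It assumes the lemma reads $(\sum_Q x^r)^2 = \tfrac14(p-3)\sum_Q x^r + \tfrac14(p+1)\sum_N x^n$, the same with $Q$ and $N$ interchanged for $(\sum_N x^n)^2$, and $\sum_Q x^r \sum_N x^n = \tfrac12(p-1) + \tfrac14(p-3)\sum_{i \ne 0} x^i$.

```python
def residue_sets(p):
    """Quadratic residues Q and nonresidues N modulo an odd prime p."""
    Q = {(i * i) % p for i in range(1, p)}
    N = set(range(1, p)) - Q
    return Q, N

def cyclic_product(A, B, p):
    """Integer coefficients of (sum_{a in A} x^a)(sum_{b in B} x^b) modulo x^p - 1."""
    out = [0] * p
    for a in A:
        for b in B:
            out[(a + b) % p] += 1
    return out

def lemma5_holds(p):
    """Check the three identities of Lemma 5 for the prime p."""
    Q, N = residue_sets(p)
    lo, hi = (p - 3) // 4, (p + 1) // 4
    qq = [0 if i == 0 else (lo if i in Q else hi) for i in range(p)]
    nn = [0 if i == 0 else (hi if i in Q else lo) for i in range(p)]
    qn = [(p - 1) // 2 if i == 0 else lo for i in range(p)]
    return (cyclic_product(Q, Q, p) == qq
            and cyclic_product(N, N, p) == nn
            and cyclic_product(Q, N, p) == qn)

if __name__ == "__main__":
    for p in (7, 11, 19, 23, 31):
        print(p, lemma5_holds(p))
```

Reducing the first identity modulo 2 for $p = 7$ or $p = 23$ shows again that $\sum_{r \in Q} x^r$ is idempotent over GF(2), as in Theorem 2.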


It follows that (17) is an idempotent iff

$$a^2 + (p - 1)bc = a,$$
$$2ab + \tfrac14(p - 3)b^2 + \tfrac14(p + 1)c^2 + \tfrac12(p - 3)bc = b,$$
$$2ac + \tfrac14(p + 1)b^2 + \tfrac14(p - 3)c^2 + \tfrac12(p - 3)bc = c.$$

The solution of these equations gives the idempotents of Theorem 4.
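§4. Extended quadratic-residue codes


We first find the duals of the QR codes.


Theorem 6.

$$\mathcal{Q}^\perp = \bar{\mathcal{Q}}, \quad \mathcal{N}^\perp = \bar{\mathcal{N}} \quad \text{if } p = 4k - 1, \qquad (18)$$
$$\mathcal{Q}^\perp = \bar{\mathcal{N}}, \quad \mathcal{N}^\perp = \bar{\mathcal{Q}} \quad \text{if } p = 4k + 1. \qquad (19)$$

In both cases,

$$\mathcal{Q} \text{ is generated by } \bar{\mathcal{Q}} \text{ and } \mathbf{1}, \qquad (20)$$
$$\mathcal{N} \text{ is generated by } \bar{\mathcal{N}} \text{ and } \mathbf{1}. \qquad (21)$$


Proof. Suppose $p = 4k - 1$. The zeros of $\mathcal{Q}$ are $\alpha^r$ for $r \in Q$. Hence by
Theorem 4 of Ch. 7, the zeros of $\mathcal{Q}^\perp$ are 1 and $\alpha^{-n}$ for $n \in N$. But $-n \in Q$, so
$\mathcal{Q}^\perp = \bar{\mathcal{Q}}$. From Theorems 2 and 4,

$$E_q(x) = F_q(x) + \frac1p \sum_{i=0}^{p-1} x^i,$$

which implies (20). Similarly for the other cases. Q.E.D.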


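Next we find generator matrices for these codes. Let

$$F_q(x) = \sum_{i=0}^{p-1} f_i x^i$$

be the idempotent of $\bar{\mathcal{Q}}$, given by Theorem 2 or 4. Then a generator matrix for
$\bar{\mathcal{Q}}$ is the $p \times p$ circulant matrix

$$G = \begin{bmatrix} f_0 & f_1 & f_2 & \cdots & f_{p-1} \\ f_{p-1} & f_0 & f_1 & \cdots & f_{p-2} \\ \vdots & & & & \vdots \\ f_1 & f_2 & f_3 & \cdots & f_0 \end{bmatrix} \qquad (22)$$

$$= (g_{ij}), \quad 0 \le i, j \le p - 1, \quad \text{with } g_{ij} = f_{j-i}$$

and with subscripts taken mod $p$. A generator matrix for $\mathcal{Q}$ is

$$\begin{bmatrix} G \\ 1\ 1\ \cdots\ 1 \end{bmatrix} \qquad (23)$$

and similarly for $\bar{\mathcal{N}}$ and $\mathcal{N}$. Of course (22) has rank $\tfrac12(p - 1)$.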
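
For a concrete instance of (22) and (23), take $l = 2$, $p = 7$. The following sketch builds the circulant from the idempotent $1 + \sum_{n \in N} x^n$ of Theorem 2 and computes ranks over GF(2). The function names are hypothetical and the rank routine is a plain Gaussian elimination on bitmasks, written only for this illustration.

```python
def circulant_rows(f):
    """Rows of the circulant (22): entry (i, j) is f[(j - i) mod p]."""
    p = len(f)
    return [[f[(j - i) % p] for j in range(p)] for i in range(p)]

def gf2_rank(rows):
    """Rank over GF(2) of a list of 0/1 rows."""
    masks = [int("".join(map(str, r)), 2) for r in rows]
    rank = 0
    while any(masks):
        pivot = max(masks)                       # row with the highest leading 1
        top = pivot.bit_length() - 1
        # Clear that leading bit from every row that has it (the pivot itself becomes 0).
        masks = [m ^ pivot if (m >> top) & 1 else m for m in masks]
        rank += 1
    return rank

p = 7
Q = {(i * i) % p for i in range(1, p)}
N = set(range(1, p)) - Q
f = [1 if (i == 0 or i in N) else 0 for i in range(p)]   # F_q(x) = 1 + x^3 + x^5 + x^6
G = circulant_rows(f)

if __name__ == "__main__":
    print(gf2_rank(G))                 # rank of (22)
    print(gf2_rank(G + [[1] * p]))     # rank of (23)
```

The two ranks come out as 3 and 4, matching the dimensions $\tfrac12(p - 1)$ and $\tfrac12(p + 1)$.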


Examples. If $l = 2$, $p = 7$, a generator matrix for the [7, 4, 3] Hamming code $\mathcal{Q}$
is

[Generator matrix (24); columns labelled 0, 1, ..., 6.]

If $l = 3$, $p = 11$, a generator matrix for the [11, 6, 5] Golay code $\mathcal{G}_{11}$ is

[Generator matrix (25); columns labelled 0, 1, ..., 10.]


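Extended QR codes. QR codes may be extended by adding an overall parity
check in such a way that

$$(\hat{\mathcal{Q}})^\perp = \hat{\mathcal{Q}}, \quad (\hat{\mathcal{N}})^\perp = \hat{\mathcal{N}}, \quad \text{if } p = 4k - 1,$$
$$(\hat{\mathcal{Q}})^\perp = \hat{\mathcal{N}}, \quad \text{if } p = 4k + 1, \qquad (26)$$

where $\hat{\ }$ denotes the extended code. If $a = (a_0, \ldots, a_{p-1})$ is a codeword of $\mathcal{Q}$
(or $\mathcal{N}$), and $p = 4k - 1$, the extended code is formed by appending

$$a_\infty = -\gamma \sum_{i=0}^{p-1} a_i, \qquad (27)$$

where $1 + \gamma^2 p = 0$. Since $(\gamma p)^2 = -p = \theta^2$, it follows that $\gamma = \epsilon\theta/p$, where
$\epsilon = \pm 1$ (either choice of sign will do). Note that $\gamma$ is chosen so that the
codeword $(1, 1, \ldots, 1, -\gamma p)$ of $\hat{\mathcal{Q}}$ (or $\hat{\mathcal{N}}$) is orthogonal to itself. If $l = 2$ or 3, $\gamma$
may be taken to be 1. $\hat{\mathcal{Q}}$ and $\hat{\mathcal{N}}$ are $[p + 1, \tfrac12(p + 1)]$ codes.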
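Theorem 7. If $p = 4k - 1$, the extended QR codes $\hat{\mathcal{Q}}$ and $\hat{\mathcal{N}}$ are self-dual.


Proof. A generator matrix for $\hat{\mathcal{Q}}$ is obtained by adding a column to (23), and is
given by

$$\hat G = \begin{bmatrix} G & \begin{matrix} 0 \\ \vdots \\ 0 \end{matrix} \\ 1\ 1\ \cdots\ 1 & -\gamma p \end{bmatrix}. \qquad (28)$$

Since $\bar{\mathcal{Q}} \subset (\bar{\mathcal{Q}})^\perp$, every row of $G$ is orthogonal to itself and to every other row.
Thus every row of $\hat G$ is orthogonal to all the rows of $\hat G$, and so $\hat{\mathcal{Q}} \subset (\hat{\mathcal{Q}})^\perp$. But
$\hat G$ has rank $\tfrac12(p + 1)$, so $\hat{\mathcal{Q}} = (\hat{\mathcal{Q}})^\perp$.


Thus the extended Golay codes $\mathcal{G}_{24} = \hat{\mathcal{G}}_{23}$ and $\mathcal{G}_{12} = \hat{\mathcal{G}}_{11}$ are self-dual.


Note. If $p = 4k + 1$, the extended codes may be defined so that $(\hat{\mathcal{Q}})^\perp = \hat{\mathcal{N}}$.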


Theorem 8. (i) If $l = 2$ and $p = 4k - 1$, the weight of every codeword in $\hat{\mathcal{Q}}$ is
divisible by 4, and the weight of every codeword in $\mathcal{Q}$ is congruent to 0 or
3 mod 4. (ii) If $l = 3$ (and without restriction on $p$) the weight of every
codeword in $\hat{\mathcal{Q}}$ is divisible by 3, and the weight of every codeword in $\mathcal{Q}$ is
congruent to 0 or 2 mod 3.


Proof. (i) If 2 is a quadratic residue mod $p$, then by Theorem 23 $p = 8m \pm 1$. So
we may assume $p = 8m - 1$. The number of residues or nonresidues is $4m - 1$,
and so the weight of each row of $\hat G$ is a multiple of 4. The result then follows
from Problem 38 of Ch. 1. (ii) If $a \in \hat{\mathcal{Q}}$ then $a_0^2 + a_1^2 + \cdots + a_\infty^2 \equiv 0 \bmod 3$, by
Theorem 7. But $a_i = \pm 1$ implies $a_i^2 = 1$, and the result follows. Q.E.D.


Note. If $l = 2$ and $p = 4k + 1$, all we can say is that the weights of $\hat{\mathcal{Q}}$ are even.


Application. Theorems 1, 8 establish that $\mathcal{G}_{23}$ and $\mathcal{G}_{11}$ have minimum distances
7 and 5 respectively. Thus $\mathcal{G}_{24}$ only contains codewords of weights 0, 8, 12, 16
and 24 (as we saw in Ch. 2), and $\mathcal{G}_{12}$ only contains weights 0, 6, 9 and 12 (see
Ch. 19).
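
The statement about the extended binary Golay code can be confirmed by brute force, since the code has only $2^{12}$ words. The sketch below is an illustration under stated assumptions: it takes $x^{11} + x^9 + x^7 + x^6 + x^5 + x + 1$ as generator polynomial, represents polynomials over GF(2) as integer bitmasks, forms every multiple $m(x)g(x)$ with $\deg m < 12$, and appends an overall parity bit. All names are hypothetical.

```python
from collections import Counter

# g(x) = x^11 + x^9 + x^7 + x^6 + x^5 + x + 1 as a bitmask (bit e <-> x^e).
G23 = sum(1 << e for e in (11, 9, 7, 6, 5, 1, 0))

def clmul(a, b):
    """Carry-less product of two bitmasks, i.e. polynomial product over GF(2)."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out

def extended_golay_weights():
    """Weight distribution of the length-24 code obtained by adding an overall parity check."""
    counts = Counter()
    for message in range(1 << 12):
        word = clmul(message, G23)        # degree at most 22, so no reduction mod x^23 + 1
        weight = bin(word).count("1")
        counts[weight + (weight & 1)] += 1  # the parity bit makes every weight even
    return dict(counts)

if __name__ == "__main__":
    print(sorted(extended_golay_weights().items()))
```

Only the weights 0, 8, 12, 16 and 24 occur, all divisible by 4 as Theorem 8(i) requires.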


Problem. (1) Show that the weight distribution of a binary quadratic residue 
code is asymptotically normal. [Hint: Theorem 23 of Ch. 9]. 
§5. The automorphism group of QR codes


In this section it is shown that the extended QR code $\hat{\mathcal{Q}}$ is fixed by the large
permutation group $PSL_2(p)$.


Definition of $PSL_2(p)$. Let $p$ be a prime of the form $8m \pm 1$. The set of all
permutations of $\{0, 1, 2, \ldots, p - 1, \infty\}$ of the form

$$y \to \frac{ay + b}{cy + d}, \qquad (29)$$

where $a, b, c, d \in GF(p)$ are such that $ad - bc = 1$, forms a group called the
projective special linear group $PSL_2(p)$ (sometimes also called the linear
fractional group).
The properties of $PSL_2(p)$ that we need are given in:


Theorem 9. (a) $PSL_2(p)$ is generated by the three permutations

$$S: y \to y + 1, \qquad V: y \to \rho^2 y, \qquad T: y \to -\frac1y, \qquad (30)$$

where $\rho$ is a primitive element of $GF(p)$.
(b) In fact $PSL_2(p)$ consists of the $\tfrac12 p(p^2 - 1)$ permutations

$$V^i S^j : y \to \rho^{2i} y + j,$$
$$V^i S^j T S^k : y \to k - (\rho^{2i} y + j)^{-1},$$

where $0 \le i < \tfrac12(p - 1)$, $0 \le j, k < p$.

(c) If $p = 8m - 1$, the generators $S$, $V$, $T$ satisfy $S^p = V^{(p-1)/2} = T^2 = (VT)^2 =
(ST)^3 = 1$, and $V^{-1}SV = S^{\rho^2}$.
(d) $PSL_2(p)$ is doubly transitive.


Proof. (a), (b) A typical element of $PSL_2(p)$,

$$y \to \frac{ay + b}{cy + d}, \qquad ad - bc = 1,$$

can be written either as

$$y \to a^2 y + ab \quad \text{if } c = 0 \text{ (for then } d = 1/a),$$

or

$$y \to \frac{a}{c} - \frac{1}{c^2 y + cd} \quad \text{if } c \ne 0 \text{ (for then } b = \frac{ad}{c} - \frac{1}{c}),$$

and these are respectively $V^i S^{ab}$ (where $a = \rho^i$), and $V^i S^{cd} T S^{a/c}$ (where
$c = \rho^i$). (c) and (d) are left as straightforward exercises for the reader.
Q.E.D.


The binary case of the main theorem is:


Theorem 10. (Gleason and Prange.) If $l = 2$ and $p = 8m \pm 1$, the extended
quadratic-residue code $\hat{\mathcal{Q}}$ is fixed by $PSL_2(p)$.


Proof. $S$ is a cyclic shift, and $V$ fixes the idempotent, so $\mathcal{Q}$ and hence $\hat{\mathcal{Q}}$ are 
fixed by $S$ and $V$. From Theorem 9(a) it remains to show that $\hat{\mathcal{Q}}$ is also fixed 
by $T$. Only the case $p = 8m - 1$ is treated. We shall show that each row of the 
generator matrix (28) of $\hat{\mathcal{Q}}$, i.e. 

[matrix $\hat{G}$; display not recoverable] (31) 

is transformed by $T$ into another codeword of $\hat{\mathcal{Q}}$. (i) The first row of $\hat{G}$, $R_0$ 
say, is 

$$R_0 = \Big|\, 1 + \sum_{n \in N} x^n \,\Big|\, 0 \,\Big|.$$

Then 

$$T(R_0) = \Big|\, \sum_{r \in Q} x^r \,\Big|\, 1 \,\Big| = R_0 + \mathbf{1} \in \hat{\mathcal{Q}}.$$

(ii) Suppose $s \in Q$; the $(s+1)$th row of $\hat{G}$ is 

$$R_s = \Big|\, x^s + \sum_{n \in N} x^{n+s} \,\Big|\, 0 \,\Big|.$$

We shall show that $T(R_s) = R_{-1/s} + R_0 + \mathbf{1} \in \hat{\mathcal{Q}}$. It is just a matter of keeping 
track of where the 1's go. $T(R_s)$ has 1's in coordinate places $-1/s$ and 
$-1/(n+s)$ for $n \in N$, which comprise $\infty$ (if $n = -s$), $2m - 1$ residues and 
$2m$ nonresidues (by Perron's Theorem 24). Also 

$$R_{-1/s} = \Big|\, x^{-1/s} + \sum_{n \in N} x^{n - 1/s} \,\Big|\, 0 \,\Big|$$

has 1's in places $-1/s$ and $n - 1/s$ ($n \in N$), which comprise $2m$ residues and 
$2m$ nonresidues. Therefore the sum $T(R_s) + R_{-1/s}$ has a 1 in place $\infty$ and a 0 in 
place $-1/s$ (a nonresidue). If $-1/(n+s) \in N$, then $-1/(n+s) = n' - (1/s)$ for 
some $n' \in N$, and the 1's in the sum cancel. Thus the nonresidue coordinate 
places in the sum always contain 0. On the other hand, if $-1/(n+s) \in Q$, then 
$-1/(n+s) \ne n' - 1/s$ for all $n' \in N$, and so the sum contains 1 in coordinate 
places which are residues. I.e., 

$$T(R_s) + R_{-1/s} = \Big|\, \sum_{r \in Q} x^r \,\Big|\, 1 \,\Big| = R_0 + \mathbf{1}.$$

Similarly if $t \in N$, 

$$T(R_t) = R_{-1/t} + R_0. \qquad \text{Q.E.D.}$$
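
The bookkeeping in this proof can be confirmed numerically. The sketch below is an illustration under stated assumptions, not the authors' construction: it takes the rows of the generator matrix to be the cyclic shifts of $1 + \sum_{n \in N} x^n$ with a 0 in place $\infty$, together with the all-ones row (valid for p = 8m - 1), stores a codeword as an integer bitmask with bit p standing for the place $\infty$, and checks that T maps every codeword to a codeword.

```python
def extended_qr_code(p):
    """Span over GF(2) of the assumed rows: shifts of 1 + sum_{n in N} x^n (place oo = 0)
    plus the all-ones row.  Intended for primes p = 8m - 1."""
    residues = {i * i % p for i in range(1, p)}
    nonres = set(range(1, p)) - residues
    rows = []
    for s in range(p):
        support = {s} | {(n + s) % p for n in nonres}
        rows.append(sum(1 << i for i in support))   # bit p (place oo) stays 0
    rows.append((1 << (p + 1)) - 1)                  # all-ones row
    code = {0}
    for r in rows:
        code |= {c ^ r for c in code}
    return code

def apply_T(word, p):
    """Move the symbol in place y to place -1/y (0 <-> oo)."""
    out = 0
    for y in range(p + 1):
        if word >> y & 1:
            if y == p:
                t = 0                                # oo -> 0
            elif y == 0:
                t = p                                # 0 -> oo
            else:
                t = (-pow(y, -1, p)) % p
            out |= 1 << t
    return out

for p in (7, 23):
    code = extended_qr_code(p)
    print(p, len(code), all(apply_T(c, p) in code for c in code))
```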


From Theorem 18 of Ch. 8, the codewords of each weight in $\hat{\mathcal{Q}}$ form a 
2-design. In some cases much stronger statements can be made; see the end 
of §8 below. 


Corollary 11. An equivalent code $\mathcal{Q}$ is obtained no matter which coordinate 
place of $\hat{\mathcal{Q}}$ is deleted. 


Proof. From Corollary 15 of Ch. 8, since $PSL_2(p)$ is transitive. Q.E.D. 


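The automorphism group of nonbinary QR codes. If $l > 2$ then (as defined in 
§5 of Ch. 8) $\mathrm{Aut}(\hat{\mathcal{Q}})$ consists of all monomial matrices which preserve $\hat{\mathcal{Q}}$. 
Since $\hat{\mathcal{Q}}$ is linear it is fixed by the $l - 1$ scalar matrices 

$$\begin{pmatrix} \beta & & 0 \\ & \ddots & \\ 0 & & \beta \end{pmatrix}$$

for $\beta \in GF(l)^*$. It is convenient to ignore these and to work instead with 
$\mathrm{Aut}(\hat{\mathcal{Q}})^\dagger = \mathrm{Aut}(\hat{\mathcal{Q}})/\text{scalar matrices}$. Clearly $|\mathrm{Aut}(\hat{\mathcal{Q}})| = (l - 1)\,|\mathrm{Aut}(\hat{\mathcal{Q}})^\dagger|$. 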


Theorem 12. (Gleason and Prange.) If $l > 2$, $\mathrm{Aut}(\hat{\mathcal{Q}})^\dagger$ contains the group 
isomorphic to $PSL_2(p)$ which is generated by $S$, $V$ and $T'$, where $T'$ is the 
following monomial generalization of $T$: the element moved from coordinate 
place $i$ is multiplied by $\chi(i)$ for $1 \le i \le p - 1$; the element moved from place 0 
is multiplied by $\epsilon$; and the element moved from $\infty$ by $\pm\epsilon$; where $\epsilon$ was defined 
in the previous section, and the + sign is taken if $p = 4k + 1$, the − sign if 
$p = 4k - 1$. 


Proof. Similar to that of Theorem 10. 

Theorems 10 and 12 only state that $\mathrm{Aut}(\hat{\mathcal{Q}})^\dagger$ contains $PSL_2(p)$. There are 
three* known cases in which $\mathrm{Aut}(\hat{\mathcal{Q}})^\dagger$ is larger than this. For $l = 2$ and $p = 7$, 
$\hat{\mathcal{Q}}$ is the [8, 4, 4] Hamming code and $|\mathrm{Aut}(\hat{\mathcal{Q}})| = 1344$, by Theorem 24 of Ch. 
13. For $l = 2$, $p = 23$ and for $l = 3$, $p = 11$, $\hat{\mathcal{Q}}$ is an extended Golay code and 
$\mathrm{Aut}(\hat{\mathcal{Q}})^\dagger$ is a Mathieu group (see Chapter 20). 

But for all other values of $l$ and $p$, $\mathrm{Aut}(\hat{\mathcal{Q}})^\dagger$ is probably equal to $PSL_2(p)$. 
This is known to be true in many cases, for example: 


Theorem 13. (Assmus and Mattson.) If $\frac{1}{2}(p - 1)$ is prime and $5 < p \le 4079$ then 
apart from the three exceptions just mentioned, $\mathrm{Aut}(\hat{\mathcal{Q}})^\dagger$ is (isomorphic to) 
$PSL_2(p)$. 


The proof is omitted. 


Problems. (2) Show that $\hat{\mathcal{Q}}$ and $\bar{\mathcal{Q}}$ have the same minimum distance, while that 
of $\mathcal{Q}$ is 1 less. 

(3) Show that (29) belongs to $PSL_2(p)$ iff $ad - bc$ is a quadratic residue 
mod $p$. 

(4) Show that the generator $V$ in (30) is redundant: $S$ and $T$ alone generate 
$PSL_2(p)$. 

(5) If $p = 8m - 1$, show that the relations given in Theorem 9(c) define an 
abstract group of order $\frac{1}{2}p(p^2 - 1)$, which is therefore isomorphic to $PSL_2(p)$. 


Research Problem (16.3). Show that the conclusion of Theorem 13 holds for all 
$p > 23$. 


§6. Binary quadratic residue codes 


Figure 16.3 gives a summary of the properties of binary QR codes. 


Examples of binary quadratic residue codes $\mathcal{Q}$. 

(i) The [7, 4, 3] Hamming code (see p. 481). 

(ii) The [17, 9, 5] code with generator polynomial $x^8 + x^5 + x^4 + x^3 + 1$ and 
idempotent $x^{16} + x^{15} + x^{13} + x^9 + x^8 + x^4 + x^2 + x + 1$. 

(iii) The [23, 12, 7] Golay code $\mathcal{G}_{23}$ (see p. 482). 

(iv) The [31, 16, 7] code with generator polynomial $x^{15} + x^{12} + x^7 + x^6 + x^2 + 
x + 1$. The quadratic residues mod 31 are $C_1 \cup C_5 \cup C_7$. 

(v) The [47, 24, 11] code with generator polynomial $x^{23} + x^{19} + x^{18} + x^{14} + 
x^{13} + x^{12} + x^{10} + x^9 + x^7 + x^6 + x^5 + x^3 + x^2 + x + 1$. 
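
The statements about quadratic residues in examples (ii) and (iv) are easy to recompute. A small sketch (the helper names are illustrative only) lists the residues mod p and the cyclotomic cosets $C_s = \{s, 2s, 4s, \ldots\}$ mod p, and checks that the residues mod 31 split into three cosets.

```python
def quadratic_residues(p):
    """Nonzero squares mod p, sorted."""
    return sorted({i * i % p for i in range(1, p)})

def cyclotomic_coset(s, p):
    """C_s = {s, 2s, 4s, ...} mod p, sorted."""
    coset, x = [], s % p
    while x not in coset:
        coset.append(x)
        x = 2 * x % p
    return sorted(coset)

print(quadratic_residues(17))
union = sorted(cyclotomic_coset(1, 31) + cyclotomic_coset(5, 31) + cyclotomic_coset(7, 31))
print(union == quadratic_residues(31))
```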


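*Four if the [6, 3, 4] code over GF(4) is counted; see p. 520 below. 

$\mathcal{Q}$, $\bar{\mathcal{Q}}$, $\mathcal{N}$, $\bar{\mathcal{N}}$ have length $p = 8m \pm 1$ (Theorem 23) and generator 
polynomials 

$$q(x) = \prod_{r \in Q} (x + \alpha^r), \quad (x+1)q(x), \quad n(x) = \prod_{n \in N} (x + \alpha^n), \quad (x+1)n(x) \tag{32}$$

respectively, where $\alpha$ is a primitive $p$th root of unity over GF(2). $\mathcal{Q}$, 
$\mathcal{N}$ have dimension $\frac{1}{2}(p + 1)$; $\bar{\mathcal{Q}}$, $\bar{\mathcal{N}}$ consist of the even weight 
codewords of $\mathcal{Q}$, $\mathcal{N}$ and have dimension $\frac{1}{2}(p - 1)$. $\mathcal{Q}$ and $\mathcal{N}$ are 
equivalent codes. For the minimum distance see Theorem 1. Examples 
are given in Fig. 16.2. 

For $p = 8m - 1$ the idempotents of $\mathcal{Q}$, $\bar{\mathcal{Q}}$, $\mathcal{N}$, $\bar{\mathcal{N}}$ may be taken to 
be 

$$\sum_{r \in Q} x^r, \quad 1 + \sum_{n \in N} x^n, \quad \sum_{n \in N} x^n, \quad 1 + \sum_{r \in Q} x^r. \tag{33}$$

For $p = 8m + 1$ the idempotents may be taken to be 

$$1 + \sum_{r \in Q} x^r, \quad \sum_{n \in N} x^n, \quad 1 + \sum_{n \in N} x^n, \quad \sum_{r \in Q} x^r. \tag{34}$$

If $p = 8m - 1$, adding an overall parity check to $\mathcal{Q}$, $\mathcal{N}$ gives self-dual 
codes $\hat{\mathcal{Q}}$, $\hat{\mathcal{N}}$ with parameters $[p + 1, \frac{1}{2}(p + 1)]$. All weights in $\hat{\mathcal{Q}}$, 
$\hat{\mathcal{N}}$ are divisible by 4. If $p = 8m + 1$, $(\hat{\mathcal{Q}})^\perp = \hat{\mathcal{N}}$. In both cases $\hat{\mathcal{Q}}$, $\hat{\mathcal{N}}$ 
are invariant under $PSL_2(p)$. $\hat{\mathcal{Q}}$ is a quasi-cyclic code with generator 
matrix (31), (44), and sometimes (39) or (48). 


Fig. 16.3. Properties of binary QR codes. 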





The first two codes with $p = 8m - 1$ are both perfect codes. If $\mathcal{C}$ is any 
binary self-dual code with all weights divisible by 4, we will see in Ch. 19 that 
$d \le 4[n/24] + 4$. This (upper) bound is considerably larger than the square root 
(lower) bound of Theorem 1, and is attained by the QR codes $\hat{\mathcal{Q}}$ of lengths 8, 
24, 32, 48, 80, 104 (and possibly others). This phenomenon has led to a 
considerable amount of work on determining the actual minimum distance of 
QR codes, some of which we now describe. 
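
As a quick worked evaluation of the upper bound just quoted (a sketch only; the function name is illustrative), the following tabulates $4[n/24] + 4$ for the six lengths listed above.

```python
def self_dual_bound(n):
    """Upper bound 4*floor(n/24) + 4 quoted in the text for self-dual codes
    with all weights divisible by 4."""
    return 4 * (n // 24) + 4

for n in (8, 24, 32, 48, 80, 104):
    print(n, self_dual_bound(n))
```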


Other forms for the generator matrix of $\hat{\mathcal{Q}}$. The generator matrix $\hat{G}$ for $\hat{\mathcal{Q}}$ 
given in Eq. (31) is a $(p + 1) \times (p + 1)$ matrix which only has rank $\frac{1}{2}(p + 1)$. 
Sometimes it is easy to find a generator matrix for $\hat{\mathcal{Q}}$ in the form $[I \mid A]$, where 
$I$ is a unit matrix and $A$ is either a circulant matrix or a bordered circulant. 


Canonical form 1 — two circulants. 


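Lemma 14. For any prime $p > 3$, $PSL_2(p)$ contains a permutation $\pi_1$ consisting 
of two cycles of length $\frac{1}{2}(p + 1)$. 

Proof. As in §5 of Ch. 12 we represent the coordinates $\{0, 1, \ldots, p - 1, \infty\}$ on 
which $PSL_2(p)$ acts by nonzero vectors $\binom{x}{y}$, where $x, y \in GF(p)$. Thus $\binom{x}{y}$ 
represents the coordinate $x/y$. We also represent an element $i \to 
(ai + b)/(ci + d)$ of $PSL_2(p)$ by the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$. We must find $\pi_1 \in PSL_2(p)$ 
such that for all nonzero $\binom{x}{y}$, 

$$\pi_1^r \binom{x}{y} = k \binom{x}{y} \ \text{for some } k \in GF(p)^* \ \text{iff} \ \tfrac{1}{2}(p + 1) \mid r. \tag{35}$$

Then $\pi_1$ does indeed consist of two cycles of length $\frac{1}{2}(p + 1)$. 
Let $\alpha \in GF(p^2)$ be a primitive $(p^2 - 1)$th root of unity. Then $\eta = \alpha^{2p-2}$ is a 
primitive $\frac{1}{2}(p + 1)$th root of unity. Set $\lambda = 1/(1 + \eta)$. Then it is immediate that 
$\lambda^r = (1 - \lambda)^r$ iff $\frac{1}{2}(p + 1) \mid r$. 

Now define $a = \lambda^2 - \lambda$. Then $-a$ is a quadratic residue mod $p$. For $-a = 
\eta/(1 + \eta)^2 = b^2$ where $b = \alpha^{p-1}/(1 + \eta)$, and $b \in GF(p)$ since 

$$b^p = \frac{\alpha^{p^2 - p}}{1 + \eta^p} = \frac{\alpha^{-(p-1)}}{1 + \eta^{-1}} = \frac{\alpha^{p-1}}{1 + \eta} = b.$$

Now we take $\pi_1 = \begin{pmatrix} 1 & 1 \\ a & 0 \end{pmatrix} \in PSL_2(p)$. The eigenvectors of $\pi_1$ are given by 

$$\pi_1 \binom{1}{\lambda - 1} = \lambda \binom{1}{\lambda - 1}, \qquad \pi_1 \binom{1}{-\lambda} = (1 - \lambda) \binom{1}{-\lambda}.$$

Hence 

$$\pi_1 = B \begin{pmatrix} \lambda & 0 \\ 0 & 1 - \lambda \end{pmatrix} B^{-1}, \quad \text{where } B = \begin{pmatrix} 1 & 1 \\ \lambda - 1 & -\lambda \end{pmatrix},$$

so 

$$\pi_1^r = B \begin{pmatrix} \lambda^r & 0 \\ 0 & (1 - \lambda)^r \end{pmatrix} B^{-1}.$$

Therefore (35) holds iff $\lambda^r = (1 - \lambda)^r = k$, which from the definition of $\lambda$ is true 
iff $\frac{1}{2}(p + 1) \mid r$. Q.E.D. 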

For example, if $p = 7$, $\pi_1$ is $i \to (i + 1)/3i$, which consists of the cycles 
$(\infty\, 5\, 6\, 0)(1\, 3\, 2\, 4)$. 
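
The cycle structure claimed in Lemma 14 can be listed directly. This is a minimal sketch: the function names and the label `INF` for the coordinate $\infty$ are assumptions made for the illustration. It computes the cycles of $i \to (i + 1)/(ci)$ for p = 7, c = 3 and for p = 23, c = 5.

```python
INF = "inf"  # label used here for the coordinate at infinity (an assumption)

def pi1(p, c):
    """The map i -> (i + 1)/(c*i) on {INF, 0, 1, ..., p-1}, returned as a dict."""
    def image(i):
        if i == INF:
            return pow(c, -1, p)            # limit of (i + 1)/(c i) is 1/c
        if i == 0:
            return INF                      # denominator vanishes
        return (i + 1) * pow(c * i % p, -1, p) % p
    return {i: image(i) for i in [INF] + list(range(p))}

def cycles(perm):
    """Disjoint cycles of a permutation given as a dict, in order of first appearance."""
    seen, out = set(), []
    for start in perm:
        if start in seen:
            continue
        cyc, y = [], start
        while y not in seen:
            seen.add(y)
            cyc.append(y)
            y = perm[y]
        out.append(cyc)
    return out

print(cycles(pi1(7, 3)))
print([len(c) for c in cycles(pi1(23, 5))])
```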
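In general, let $\pi_1$ consist of the cycles 

$$(l_1\, l_2 \cdots l_{(p+1)/2})(r_1\, r_2 \cdots r_{(p+1)/2}). \tag{36}$$

We take any codeword $c \in \hat{\mathcal{Q}}$ and arrange the coordinates in the order 
$l_1 \cdots l_{(p+1)/2}\, r_1 \cdots r_{(p+1)/2}$ given by (36). Then the codewords $c, \pi_1 c, 
\pi_1^2 c, \ldots, \pi_1^{(p-1)/2} c$ form an array 

$$[L \mid R], \tag{37}$$

where $L$ and $R$ are $\frac{1}{2}(p + 1) \times \frac{1}{2}(p + 1)$ circulant matrices. For example, if 
$c = (c_0 c_1 \cdots c_6 c_\infty) = (01101001)$ is the extended idempotent of the [8, 4, 4] code 
$\hat{\mathcal{Q}}$ then $[L \mid R]$ is 

              ∞ 5 6 0   1 3 2 4
            [ 1 0 0 0 | 1 0 1 1 ]
            [ 0 1 0 0 | 1 1 0 1 ]        (38)
            [ 0 0 1 0 | 1 1 1 0 ]
            [ 0 0 0 1 | 0 1 1 1 ]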





Of course (37) always generates a subcode of $\hat{\mathcal{Q}}$. But if we can find a 
codeword $c$ such that either $L$ or $R$ has rank $\frac{1}{2}(p + 1)$, we can invert it (see 
Problem 7) and obtain a generator matrix for $\hat{\mathcal{Q}}$ in the first canonical form 

$$G_1 = [I \mid A], \tag{39}$$

where $A$ is a $\frac{1}{2}(p + 1) \times \frac{1}{2}(p + 1)$ circulant matrix. For example, (38) is already in 
this form. 

A code with generator matrix of the form (39) is called a double circulant 
code. We also apply the same name to codes with generator matrices (45), 
(46), etc., differing from (39) only in the presence of borders around various 
parts of the matrix. It will be shown in §7 that such codes are examples of 
quasi-cyclic codes. Further properties of these codes are given there. 


Example. The [24, 12, 8] Golay code $\mathcal{G}_{24}$. If $p = 23$ the permutation $\pi_1: i \to 
(i + 1)/5i$ consists of the cycles 

(21, 7, 16, 12, 19, 22, 0, ∞, 14, 15, 18, 2)(20, 17, 4, 6, 1, 5, 3, 11, 9, 13, 8, 10). 
(40) 


For $c(x)$ we take the sum of the idempotent 

$$E(x) = \sum_{r \in Q} x^r$$

of $\mathcal{G}_{23}$ and three cyclic shifts of $E(x)$ (with an overall parity check added): 


©0123456789 10 11 12 13 14 15 16 17 18 19 20 21 22 
E(x) =1 1111 1] 11 1 1 l 1 
xE(x) = 1 1111 1 1] ! 1 1 l 1 
x'E(x)-1 1111 1 1 1 1 1 1 1 
XE(x)=1 1111 1 1 1 1 d» 1 l 
c(x)= 1 1 11 l 1 1 1 


Arranging the coordinates of $c(x)$ in the order (40) we obtain the generator 
matrix for $\mathcal{G}_{24}$ shown in Fig. 16.4. 

[Fig. 16.4: the matrix (41), with columns labeled in the order of (40): 
21 7 16 12 19 22 0 ∞ 14 15 18 2 | 20 17 4 6 1 5 3 11 9 13 8 10; entries not recoverable.] 

Fig. 16.4. A double circulant generator matrix for the Golay code $\mathcal{G}_{24}$. 


Research Problem (16.4). Is it always possible to find c such that either L or R 
in (37) is invertible? Find a general method for getting the canonical form (39) 
for an extended QR code. 

Canonical form 2: Two bordered circulants. The second canonical form arises 
because $PSL_2(p)$ contains the permutation $\pi_2: i \to ri$, where $r = \rho^2$ is a 
generator of the cyclic group of quadratic residues mod $p$. The cycle structure 
of $\pi_2$ is 

$$(\infty)(n, nr, nr^2, \ldots, nr^{(p-3)/2})(1, r, r^2, \ldots, r^{(p-3)/2})(0), \tag{42}$$

where $n$ is any nonresidue. (For $p = 8m - 1$ we may take $n$ to be $-1$.) Let $G_2$ 
be obtained from $\hat{G}$ by arranging the rows and columns of $\hat{G}$ in the order 
shown in (42). 


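Example. The generator matrix for the [8, 4, 4] code ((24) with an overall 
parity check) becomes 

               ∞ 6 5 3 1 2 4 0
          ∞  [ 1 1 1 1 1 1 1 1 ]
          6  [ 0 1 1 0 0 1 1 0 ]
          5  [ 0 0 1 1 1 0 1 0 ]
  G_2 =   3  [ 0 1 0 1 1 1 0 0 ]        (43)
          1  [ 0 1 0 0 1 0 1 1 ]
          2  [ 0 0 1 0 1 1 0 1 ]
          4  [ 0 0 0 1 0 1 1 1 ]
          0  [ 0 1 1 1 0 0 0 1 ]

In general we write $G_2$ as 

[block matrix with blocks $A$, $B$, $C$, $D$ and entries $\epsilon$; display not recoverable] (44) 

where 

$$\epsilon = \begin{cases} 1 & \text{if } p = 8m - 1, \\ 0 & \text{if } p = 8m + 1. \end{cases}$$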

Problem. (6) (a) Show that $A$, $B$, $C$, $D$ are circulant matrices. 

(b) If $p = 8m - 1$, show that $C = C^T = B + J$ and $D = A^T = A + I + J$, 
where $J$ is a matrix all of whose entries are 1. Also the rows of $A$, $B$ and $D$ 
have weight $2m$ and the rows of $C$ have weight $2m - 1$. 

(c) If $p = 8m + 1$, show that $A = A^T$, $B = C^T$, $D = D^T$, the rows of $B$, $C$ and 
$D$ have weight $2m$, and the rows of $A$ have weight $2m - 1$. 


Now if $p = 8m - 1$, $C$ may be invertible ($A$, $B$ and $D$ are certainly not since 
they have even weight), and if $p = 8m + 1$, $A$ may be. If so we readily arrive 
at the following canonical form for the generator matrix: 

[bordered matrix built from $I$, $T$ and the border entries $a$, $b$, $c$; display not recoverable] (45) 

where $I$ is a $\frac{1}{2}n \times \frac{1}{2}n$ unit matrix, $T$ is a $(\frac{1}{2}n - 1) \times (\frac{1}{2}n - 1)$ circulant matrix (here 
$n = p + 1$), and $a$, $b$, $c$ are 0 or 1. This is called a (bordered) double circulant 
code. 


Examples. (1) For $p = 7$, $C$ in (43) is already an identity matrix and we obtain 

[matrix display not recoverable] (46) 

(2) The Golay code $\mathcal{G}_{24}$ again. The first two rows of the generator matrix $\hat{G}$, 
Equation (31), are: 

   ∞ | 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
   0 | 1 0 0 0 0 1 0 1 0 0  1  1  0  0  1  1  0  1  0  1  1  1  1
   0 | 1 1 0 0 0 0 1 0 1 0  0  1  1  0  0  1  1  0  1  0  1  1  1   (*)

Using (42), we arrange the rows and columns in the following order: 

   ∞ | 14  5 10 20 17 11 22 21 19 15  7 |  1  2  4  8 16  9 18 13  3  6 12 | 0
   0 |  0  0  0  1  0  1  1  1  0  1  0 |  1  0  0  1  1  0  1  0  0  1  1 | 1

The first rows of $C$ and $D$ are obtained from (*) and are shown above. To 
find the inverse of $C$ (see Problem 7) we write the first row of $C$ as 
$a(x) = x^3 + x^5 + x^6 + x^7 + x^9$ and find the inverse of $a(x)$ mod $x^{11} + 1$. This is 
$1 + x + x^2 + x^5 + x^8 + x^9 + x^{10}$. To find $C^{-1}D$ we compute 

$$(1 + x + x^2 + x^5 + x^8 + x^9 + x^{10})(1 + x^3 + x^4 + x^6 + x^9 + x^{10}) \bmod x^{11} + 1$$
$$= 1 + x^2 + x^6 + x^7 + x^8 + x^{10}$$
$$= 1 + \sum_{n \in N} x^n, \qquad N = \text{nonresidues mod } 11. \tag{47}$$

Thus $\mathcal{G}_{24}$ has a generator matrix 

[matrix display not recoverable] (48) 

where $T = C^{-1}D$ is the circulant with first row given by (47): 

   0 1 2 3 4 5 6 7 8 9 10 
   1 0 1 0 0 0 1 1 1 0 1        (49) 

This is equivalent to the definition of $\mathcal{G}_{24}$ given in Fig. 2.13 (see Problem 9). 
Thus we have shown that some binary QR codes are equivalent to (possibly 
bordered) double circulant codes. (The same procedure can be applied to 
non-binary codes.) 

Research Problem (16.5). Sometimes it is not possible to invert C or A (e.g. for 
p = 127 -see Karlin [715]). In this case is it still possible to find a generator 
matrix of the form (45)? 


Problems. (7) Circulant matrices. (i) Show that the algebra of $n \times n$ circulant 
matrices over a field $F$ is isomorphic to the algebra of polynomials in the ring 
$F[x]/(x^n - 1)$, if the circulant matrix 

$$A = \begin{pmatrix} a_0 & a_1 & \cdots & a_{n-1} \\ a_{n-1} & a_0 & \cdots & a_{n-2} \\ \vdots & & & \vdots \\ a_1 & a_2 & \cdots & a_0 \end{pmatrix} \tag{50}$$

is mapped onto the polynomial 

$$a(x) = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1}. \tag{51}$$

(ii) The sum and product of two circulants is a circulant. In particular 
$AB = C$ where $c(x) = a(x)b(x) \bmod x^n - 1$. 

(iii) $A$ is invertible iff $a(x)$ is relatively prime to $x^n - 1$. The inverse, if it 
exists, is $B$ where $a(x)b(x) \equiv 1 \bmod x^n - 1$. 

(iv) $A^T$ is a circulant corresponding to the polynomial $a_0 + a_{n-1}x + \cdots + 
a_1 x^{n-1}$. 

(v) Let $T$ be the circulant with $a_1 = 1$, $a_i = 0$ for $i \ne 1$. Show that $A = 
\sum_{i=0}^{n-1} a_i T^i$. 

(vi) Let $\beta$ be a primitive $n$th root of unity in some field containing $F$. Show 
that $a(1), a(\beta), \ldots, a(\beta^{n-1})$ are the eigenvalues of $A$ (see Problem 20 of Ch. 7). 

(8) Let $\mathcal{A}$ (resp. $\mathcal{B}$) be a binary $[2m, m]$ code with generator matrix $[I \mid A]$ 
(resp. $[I \mid B]$) where $A$ and $B$ are circulants with first rows 

$$a(x) = \sum_{i=0}^{m-1} a_i x^i, \qquad b(x) = \sum_{i=0}^{m-1} b_i x^i.$$

(i) Show that $\mathcal{A}$ and $\mathcal{B}$ are equivalent (a) if $B = A^T$, or (b) if $a_i = b_{m-i-1}$ for all 
$i$ (i.e. if $b(x)$ is the reciprocal of $a(x)$), or (c) if $a(x)b(x) \equiv 1 \bmod x^m - 1$, or (d) if 
$b(x) = a(x)^2$ and $m$ is odd, or (e) if $b(x) = a(x^u)$ where $u$ and $m$ are relatively 
prime. 
(ii) Show that $\mathcal{A}^\perp$ is equivalent to $\mathcal{A}$. 
(iii) Show that $\mathcal{A}^\perp = \mathcal{A}$ iff $AA^T = I$. 
(9) Show that the generator matrix (48), (49) for $\mathcal{G}_{24}$ is equivalent to the one 
given in Fig. 2.13. 
(10) Let $\mathcal{C}$ have generator matrix (45) with $a = c = 1$, $b = 0$. Show that $\mathcal{C}^\perp = \mathcal{C}$ 
iff $TT^T = I + J$. 


A method for finding low weight codewords in $\mathcal{Q}$ (Karlin and MacWilliams 
[718]). The generator matrix (45) makes it simpler to find the minimum distance 
of $\mathcal{Q}$ or $\hat{\mathcal{Q}}$. Many of the results shown in Fig. 16.2 were found in this way. 

In some cases it is possible to find low weight codewords directly, as we now 
show. 

The permutation $V: y \to \rho^2 y$, where $\rho$ is a primitive element of $GF(p)$, is in 
$\mathrm{Aut}(\mathcal{Q})$. The order of this permutation is $\frac{1}{2}(p - 1)$; if this is a composite number, 
say $\frac{1}{2}(p - 1) = sf$, then $\mathcal{Q}$ contains codewords which are fixed by the permutation 
$U: y \to \rho^{2s} y$. Let $\mathcal{U}$ be the subcode consisting of such codewords. We 
describe how to find $\mathcal{U}$ for $p = 8m - 1$. Let $e = 2s$, $p - 1 = ef$, where $s$ and $f$ are 
odd. Divide the integers mod $p$ into $e$ classes 

$$T_i = \{\rho^{ej + i} : j = 0, 1, \ldots, f - 1\}, \qquad i = 0, \ldots, e - 1.$$

For example, if $p = 71$, $p - 1 = 70 = 14 \cdot 5$, $e = 14$, $\rho = 7$: 

$$T_0 = \{1, 7^{14}, 7^{28}, 7^{42}, 7^{56}\} = \{1, 54, 5, 57, 25\},$$
$$T_1 = \{7, 7^{15}, 7^{29}, 7^{43}, 7^{57}\} = \{7, 23, 35, 44, 33\}.$$

Let 

$$X_i = \sum_{t \in T_i} x^t \quad \text{for } i = 0, \ldots, e - 1. \tag{52}$$
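
The classes $T_i$ are simple to tabulate. The following sketch (the function name `classes` is illustrative) recomputes $T_0$ and $T_1$ for p = 71, ρ = 7, e = 14, and checks that multiplication by $\rho^e$, i.e. the permutation U, maps each class to itself.

```python
def classes(p, rho, e):
    """T_i = [rho^(e*j + i) mod p for j = 0..f-1], for i = 0..e-1, with p - 1 = e*f."""
    f = (p - 1) // e
    return [[pow(rho, e * j + i, p) for j in range(f)] for i in range(e)]

T = classes(71, 7, 14)
print(T[0], T[1])

u = pow(7, 14, 71)                       # U: y -> rho^e y
print(all({u * t % 71 for t in Ti} == set(Ti) for Ti in T))
```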


Problem 11. (i) Show that $X_i$ is fixed by $U$, and that any polynomial which is 
fixed by $U$ is a linear combination of $1, X_0, \ldots, X_{e-1}$. 

(ii) Show that $E_q(x) = X_0 + X_2 + \cdots + X_{2s-2}$. 
(iii) Show that $X_i E_q(x) = \eta_i + \sum_j \epsilon_{ij} X_j$ for some $\eta_i$, $\epsilon_{ij}$ in GF(2), for $i = 
0, \ldots, e - 1$. 
(iv) Thus $\mathcal{U}$, the subcode of $\mathcal{Q}$ fixed by $U$, consists of linear combinations of 
$E_q(x)$ and $X_i E_q(x)$ for $i = 0, \ldots, e - 1$. 


The coefficients $\eta_i$ and $\epsilon_{ij}$ are shown in Fig. 16.5. 
(These are not the same as the matrices $A$ and $B$ in (44).) Then 

[matrix $G$ built from the circulants $A$ and $B$; display not recoverable] (53) 

is a generator matrix for $\mathcal{U}$. We state the following results without proof. $G$ is a 
$(2s + 1) \times (2s + 1)$ matrix with rank $s + 1$. $A$, $B$ are circulant matrices, and the 
first rows are given by the following easy formula. Let 

$$[m] = \begin{cases} 1 & \text{if } m \text{ is a quadratic residue mod } p, \\ 0 & \text{if } m \text{ is a nonresidue or } 0. \end{cases}$$

Then 

$$a_i = [\rho^{2fi} - 1], \qquad b_i = [-\rho^{2fi} - 1], \qquad i = 1, \ldots, s - 1,$$
$$a_0 = \tfrac{1}{2}(s + 1) \bmod 2, \qquad b_0 = 0.$$

Thus $\mathcal{U}$ may be regarded as a $[2s + 1, s + 1]$ code. The connection with $\mathcal{Q}$ is that 
if $\eta + X_{i_1} + \cdots + X_{i_t}$ is a minimum weight codeword in $\mathcal{U}$, then $\mathcal{Q}$ contains 
codewords of weight $\eta + tf$. Thus the minimum weight of $\mathcal{U}$ provides an upper 
bound to the minimum weight of $\mathcal{Q}$. In fact in all cases for which we know the 
minimum weight of $\mathcal{Q}$, this weight is found in $\mathcal{U}$. 


Example. $p = 71$, $e = 14$, $f = 5$, $\rho = 7$. 


To find the first rows of $A$ and $B$ we calculate, using Vinogradov [1371, pp. 
220-225], 

   49^5  − 1 = 44 ∈ N      −49^5  − 1 = 25 ∈ Q 
   49^10 − 1 = 36 ∈ Q      −49^10 − 1 = 33 ∈ N 
   49^15 − 1 = 31 ∈ N      −49^15 − 1 = 38 ∈ Q 
   49^20 − 1 = 19 ∈ Q      −49^20 − 1 = 50 ∈ Q 
   49^25 − 1 = 47 ∈ N      −49^25 − 1 = 22 ∈ N 
   49^30 − 1 = 29 ∈ Q      −49^30 − 1 = 40 ∈ Q, 

and the first rows of $A$, $B$ are therefore 

   | 0 | 0 | 1 | 0 | 1 | 0 | 1 |,    | 0 | 1 | 0 | 1 | 1 | 0 | 1 |. 

Notice that all that is needed to write down the matrix $G$ is a table of powers of 
$\rho$. Further the same matrix $G$ will appear for many different primes $p$. 
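
The residue tests in this example can be reproduced with Euler's criterion instead of a table. The sketch below is an illustration only: it assumes the first rows are obtained from the powers $49^{5i} = \rho^{2fi}$ that appear in the example, and the function names are hypothetical.

```python
def bracket(m, p):
    """[m] = 1 if m is a nonzero quadratic residue mod p, else 0 (Euler's criterion)."""
    m %= p
    return int(m != 0 and pow(m, (p - 1) // 2, p) == 1)

def first_rows(p, rho, s, f):
    """First rows of the circulants A and B, using the powers rho^(2*f*i) of the example
    (an assumption about the general formula)."""
    g = pow(rho, 2 * f, p)
    a = [((s + 1) // 2) % 2] + [bracket(pow(g, i, p) - 1, p) for i in range(1, s)]
    b = [0] + [bracket(-pow(g, i, p) - 1, p) for i in range(1, s)]
    return a, b

a_row, b_row = first_rows(71, 7, 7, 5)
print(a_row, b_row)
```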

The matrix $A$ is frequently invertible, in which case one finds a generator 
matrix for $\mathcal{U}$ in the form 

[matrix built from $I$ and the circulant $R$; display not recoverable] (54) 

where $R$ is an orthogonal circulant matrix. 


Example. (Cont.) The inverse of the polynomial $x^2 + x^4 + x^6$ mod $x^7 + 1$ is 
$x + x^2 + x^3 + x^4 + x^5$. Thus $G$ becomes 

[matrix display not recoverable; its first row is 0 | 1111111 | 0000000] 

The sum of the first, second and last rows of this matrix is $1 + X_0 + X_{13}$, so $\mathcal{Q}$ is a 
[71, 36] code which contains codewords of weight $1 + 2 \cdot 5 = 11$. 
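
Inverting a circulant amounts to inverting its first-row polynomial mod $x^n + 1$ (Problem 7(iii)). As a minimal sketch, with polynomials over GF(2) stored as integers (bit i is the coefficient of $x^i$) and a brute-force search that is only sensible for small n, the inverse used in this example can be confirmed as follows.

```python
def mul_mod(a, b, n):
    """Product of two GF(2) polynomials modulo x^n + 1 (both given as n-bit integers)."""
    mask = (1 << n) - 1
    out = 0
    for i in range(n):
        if a >> i & 1:
            out ^= ((b << i) | (b >> (n - i))) & mask   # x^i * b(x), reduced cyclically
    return out

def inverse(a, n):
    """Inverse of a(x) modulo x^n + 1 by exhaustive search; None if a is not invertible."""
    return next((b for b in range(1, 1 << n) if mul_mod(a, b, n) == 1), None)

a = 0b1010100                      # x^2 + x^4 + x^6
print(bin(inverse(a, 7)))
```

The same two functions confirm the degree-11 computation (47): the product of the inverse of the first row of $C$ with the first row of $D$ reduces to $1 + \sum_{n \in N} x^n$.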

The procedure for $p = 8m + 1$ is similar. 

Some negative results obtained in this way are shown in Fig. 16.6. 


                    p      bound 
   p = 8m − 1     631       83 
                  727       67 
                  751       99 
                  991       91 
   p = 8m + 1     233       25 
                  241       41 
                  257       33 
                  761       75 
                 1361      135 


Fig. 16.6. Upper bounds on the minimum distance of $\mathcal{Q}$. 

Research Problem (16.6). If $\mathcal{Q}$ contains nontrivial subcodes $\mathcal{U}$, is a codeword of 
minimum weight always contained in one of these subcodes? 


Another method of using the group to find weights in 2. A powerful method for 
finding the weight enumerator of QR and other codes has been used by 
Mykkeltveit, Lam and McEliece [979]. It is based on the following. 


Lemma 15. Let $\mathcal{C}$ be a code and $H$ any subgroup of $\mathrm{Aut}(\mathcal{C})$. If $A_i$ is the total 
number of codewords in $\mathcal{C}$ of weight $i$, and $A_i(H)$ the number which are fixed by 
some non-identity element of $H$, then 

$$A_i \equiv A_i(H) \bmod |H|.$$


Proof. The codewords of weight $i$ can be divided into two classes, those fixed 
by some non-identity element of $H$, and the rest. If $a \in \mathcal{C}$ is not fixed by any 
such element, then the $|H|$ codewords $ga$ for $g \in H$ must all be distinct. Thus 
$A_i = A_i(H) +$ a multiple of $|H|$. 
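
A small instance makes the congruence concrete. The sketch below is an illustration, not an example from the text: it takes the cyclic [7, 4, 3] Hamming code generated by $1 + x + x^3$ and lets $H$ be the group of the seven cyclic shifts, so that only the zero word and the all-ones word are fixed by a non-identity element of $H$.

```python
from collections import Counter

N = 7

def hamming_code():
    """Cyclic [7,4,3] code generated by g(x) = 1 + x + x^3 (bit i = coefficient of x^i)."""
    code = {0}
    for i in range(4):
        row = 0b1011 << i
        code |= {c ^ row for c in code}
    return code

def shift(word, k):
    """Cyclic shift of a length-7 word by k places."""
    k %= N
    return ((word << k) | (word >> (N - k))) & ((1 << N) - 1)

code = hamming_code()
weight = lambda c: bin(c).count("1")
A = Counter(weight(c) for c in code)
A_H = Counter(weight(c) for c in code
              if any(shift(c, k) == c for k in range(1, N)))
print(dict(A), dict(A_H))
print(all((A[i] - A_H[i]) % N == 0 for i in range(N + 1)))
```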


For a prime $q$ dividing $|\mathrm{Aut}(\mathcal{C})|$, let $S_q$ be a maximal subgroup of $\mathrm{Aut}(\mathcal{C})$ 
whose order is a power of $q$ ($S_q$ is a Sylow subgroup of $\mathrm{Aut}(\mathcal{C})$; these groups 
are especially simple if $\mathrm{Aut}(\mathcal{C}) = PSL_2(p)$). By letting $H$ run through the 
groups $S_q$ the lemma determines $A_i \bmod |S_q|$ for each $q$, and hence determines 
$A_i \bmod |\mathrm{Aut}(\mathcal{C})|$, from the Chinese Remainder Theorem (Problem 8 of Ch. 10). 
If $|\mathrm{Aut}(\mathcal{C})|$ is large this is often enough to determine $A_i$ exactly. See [979] for 
examples. 


Problem. (12) For some values of $p$ in Fig. 16.2, $\hat{\mathcal{Q}}$ has $d' > s$, in the notation of Ch. 
6. Find these values and investigate the designs formed by the codewords. 


§7. Double circulant and quasi-cyclic codes 


A double circulant code was defined in the previous section to be a code 
with generator matrix of the form 

[bordered matrix built from $I$, $A$ and the border entries $a$, $b$, $c$; display not recoverable] (55) 

where $A$ is a circulant matrix with first row $a(x) = a_0 + a_1 x + a_2 x^2 + \cdots$, and $a$, 
$b$, $c$ are 0 or 1. We use the same name for various codes obtained from (55), for 
example that obtained by deleting the first row and the first and last columns. 
The latter transformation changes (55) to 

$$[I \mid A] \tag{56}$$

where $I$ and $A$ are $k \times k$ matrices: (56) generates a $[2k, k]$ double circulant code 
which we denote in this section by $\mathcal{D}_a$. Some properties of $\mathcal{D}_a$ are given in 
Problem 8. 


Problem. (13) Show that $\mathcal{D}_a$ is particularly simple to encode: the message $u(x)$ 
becomes the codeword $|u(x)|u(x)a(x)|$. 


Quasi-cyclic codes. A code is called quasi-cyclic if there is some integer s such 
that every cyclic shift of a codeword by s places is again a codeword. Of course 
a cyclic code is a quasi-cyclic code with s = 1. 

By taking the columns of (56) in the order $1, k + 1, 2, k + 2, \ldots$ we see that a 
double circulant code $\mathcal{D}_a$ is equivalent to a quasi-cyclic code with $s = 2$. For 
example (46) gives 

   110001 
   011100 
   000111 

which is indeed quasi-cyclic with $s = 2$. Also Lemma 14 and Equation (44) 
show that any binary QR code $\mathcal{Q}$ with one coordinate deleted, or any extended 
code $\hat{\mathcal{Q}}$, is equivalent to a quasi-cyclic code with $s = 2$. 
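
The encoding rule of Problem 13 and the column interleaving just described fit in a few lines. This is a minimal sketch with illustrative function names; it uses $k = 3$ and $a(x) = 1 + x^2$, which reproduces the three rows displayed above, and checks that shifting any interleaved codeword by two places gives another codeword.

```python
from itertools import product

def poly_mul_mod(u, a):
    """u(x) a(x) mod x^k - 1 over GF(2); polynomials are coefficient lists of length k."""
    k = len(a)
    out = [0] * k
    for i, ui in enumerate(u):
        if ui:
            for j, aj in enumerate(a):
                if aj:
                    out[(i + j) % k] ^= 1
    return out

def encode(u, a):
    """Codeword |u(x)|u(x)a(x)| of the double circulant code with first row a(x)."""
    return list(u) + poly_mul_mod(u, a)

def interleave(word):
    """Take the columns in the order 1, k+1, 2, k+2, ..."""
    k = len(word) // 2
    return [word[j // 2 + (j % 2) * k] for j in range(2 * k)]

a = [1, 0, 1]                                     # a(x) = 1 + x^2
rows = ["".join(map(str, interleave(encode(u, a))))
        for u in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]
print(rows)

code = {tuple(interleave(encode(u, a))) for u in product((0, 1), repeat=3)}
print(all(w[-2:] + w[:-2] in code for w in code))  # closed under shifts by 2
```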

Since many of the codes in this Chapter have the form (56), we know that 
there are good codes of this type. In fact some long double circulant (or 
quasi-cyclic) codes $\mathcal{D}_a$ meet the Gilbert-Varshamov bound. The following 
theorem is a slightly weaker result. 


Theorem 16. If $k$ is a prime such that 2 is a primitive element of $GF(k)$, and if 

$$1 + \binom{2k}{2} + \binom{2k}{4} + \cdots + \binom{2k}{2r - 2} < 2^{k-1}, \quad \text{where } 2r < k,$$

then there is a $[2k, k]$ double circulant binary code $\mathcal{D}_a$ with minimum distance at 
least $2r$. 


Proof. Choose $a(x)$ to have odd weight less than $k$. The number of such codes 
$\mathcal{D}_a$ is $2^{k-1} - 1$. From the assumption about $k$, the complete factorization of 
$x^k + 1$ over GF(2) is $(x + 1) \sum_{i=0}^{k-1} x^i$. Hence $a(x)$ has an inverse in the ring 
$GF(2)[x]/(x^k + 1)$. 

Let $|l(x)|r(x)|$ be a typical codeword of $\mathcal{D}_a$; by Problem 13, $r(x) = l(x)a(x)$. 
Hence $\mathrm{wt}(l(x))$ and $\mathrm{wt}(r(x))$ have the same parity. Note also that $\mathcal{D}_a$ contains 
the codeword $\mathbf{1}$ (the sum of the rows of (56)). 

If $l(x)$ has odd weight less than $k$, then $|l(x)|r(x)|$ for a given $r(x)$ is in 
exactly one code $\mathcal{D}_a$, namely the one for which $a(x) = l(x)^{-1} r(x)$. 

If $l(x)$ has even weight then again $|l(x)|r(x)|$ is in exactly one $\mathcal{D}_a$, namely 
that for which 

$$a(x) = (1 + l(x))^{-1}(1 + r(x)).$$

Therefore each $|l(x)|r(x)|$ with weight less than $k$ is in exactly one code $\mathcal{D}_a$. 
The remainder of the proof is as usual; cf. Theorem 31 of Ch. 17. Q.E.D. 


Unfortunately it is at present only a conjecture that there are infinitely many 
primes $k$ such that 2 is a primitive element of $GF(k)$. If this were proved, then 
we could simply take $k$ to be arbitrarily large in Theorem 16 and deduce that 
some very large codes $\mathcal{D}_a$ meet the Gilbert-Varshamov bound. But even without 
this result Kasami has used Theorem 16 to show that some codes $\mathcal{D}_a$ meet the 
Gilbert-Varshamov bound; see [730] for the proof. 


The double circulant codes $\mathcal{B}$. We now define a certain family of binary double 
circulant codes of length $8m$, where $q = 4m - 1$ is a prime of the form $8k + 3$ 
(but now $8m - 1$ need not be a prime). 


Definition. Let $\mathcal{B}$ be the $[8m = 2q + 2, q + 1]$ code with generator matrix 

[matrix with columns labeled $l_\infty\, l_0\, l_1 \cdots l_{q-1}\ r_\infty\, r_0\, r_1 \cdots r_{q-1}$; display not recoverable] (57) 

where $A$ is a circulant matrix which in the first row has a 1 in position $r_i$ iff $i = 0$ 
or $i$ is a quadratic residue mod $q$, i.e. 

$$a(x) = 1 + \sum_{r \in Q} x^r.$$

For example, if $q = 3$, 

[matrix display not recoverable] (58) 

which of course is equivalent to the extended Hamming code (46). For $q = 11$ 
we obtain the [24, 12, 8] Golay code, as we saw in Problem 8. 


Problem. (14) Show that $\mathcal{B}$ is self-dual and has all weights divisible by 4. 


If $q = 19$, $\mathcal{B}$ is a [40, 20, 8] code, and if $q = 43$, $\mathcal{B}$ is a [88, 44, 16] code, both of 
which have the greatest possible minimum distance for self-dual codes with 
weights divisible by 4, as will be shown in Ch. 19. The weight distributions of 
these codes will also be given there. 

$\mathcal{B}$ can also be considered as an extended cyclic code over GF(4). Let 
$\omega \in GF(4)$ be a primitive cube root of unity, with $\omega^2 + \omega + 1 = 0$, and change $l_i$, $r_i$ 
into $v_i = l_i \omega + r_i \omega^2$; i.e. 

   l_i  r_i   v_i 
    0    0    0 
    1    0    ω 
    0    1    ω² 
    1    1    1 


Problem. (15) Show that this mapping sends $\mathcal{B}$ into an extended cyclic code $\mathcal{D}$ 
over GF(4). The idempotent of $\mathcal{D}$ is 

$$\omega^2 \sum_{r \in Q} x^r + \omega \sum_{n \in N} x^n,$$

where $Q$ and $N$ are the quadratic residues and nonresidues mod $q$. Show that $\mathcal{D}$ 
is obtained by extending the cyclic code with generator polynomial $\prod_{r \in Q} (x + 
\alpha^r)$, where $\alpha$ is a primitive $q$th root of unity in an extension field of GF(4). (Thus 
we may say that $\mathcal{D}$ is an extended quadratic-residue code over GF(4).) 


Example. Consider a code in which the first row of the generator matrix is 

   110000000000  011011100010. 

(This is the Golay code.) This maps into the GF(4) vector 

   ω  1  ω²  0  ω²  ω²  ω²  0  0  0  ω²  0. 

By adding the vector $\mathbf{1}$ and multiplying by $\omega$ we obtain the (extended) 
idempotent 

   ∞  0  1   2  3   4   5   6  7  8  9   10 
   1  0  ω²  ω  ω²  ω²  ω²  ω  ω  ω  ω²  ω. 


Problems. (16) Show that $\mathrm{Aut}(\mathcal{B})$ contains (i) $PSL_2(q)$ applied simultaneously 
to both sides of (57), and (ii) the permutation 

$$(l_\infty\, r_\infty)(l_0\, r_0)(l_1\, r_{q-1})(l_2\, r_{q-2}) \cdots (l_{q-1}\, r_1) \tag{59}$$

which interchanges the two sides. 
(17) Show that $\mathcal{D}$, the code over GF(4), is fixed by $PSL_2(q)$ and by the 
generalized permutation $(v_\infty)(v_0)(v_1\, v_{q-1})(v_2\, v_{q-2}) \cdots$ with $\omega$ and $\omega^2$ 
interchanged. 


Research Problem (16.7). Is there a square root bound for the minimum 
distance of $\mathcal{B}$, analogous to Theorem 1? 


We conclude this section with a table (Fig. 16.7) showing some examples of 
good double circulant codes. The first row of $A$, $a(x) = a_0 + a_1 x + \cdots$, is given 
in octal as $|a_0 a_1 a_2|a_3 a_4 a_5| \cdots$. E.g. 426 stands for $1 + x^4 + x^6 + x^7$. 
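
The octal convention can be unpacked mechanically. A minimal sketch (the function name is illustrative): each octal digit contributes three coefficients, most significant bit first, so 426 expands to the bit string 100 010 110.

```python
def octal_to_exponents(octal):
    """Exponents i with a_i = 1, where the string lists a_0 a_1 a_2 | a_3 a_4 a_5 | ... in octal."""
    bits = "".join(format(int(digit, 8), "03b") for digit in octal)
    return [i for i, b in enumerate(bits) if b == "1"]

print(octal_to_exponents("426"))     # exponents of 1 + x^4 + x^6 + x^7
print(octal_to_exponents("5072"))    # compare with the first row (49)
```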


    n    k    d   Definition   a(x) 
    6    3    3   (56)         3 
    8*   4    4   (38)         7 
   10    5    4   (56)         7 
   12    6    4   (56)         6 
   14    7    4   (56)         77 
   16    8    5   (56)         426 
   18    9    6   (56)         362 
   20   10    6   (56)         75 
   22   11    7   (56)         355 
   24*  12    8   (41)         675 
   24*  12    8   (48)         5072 or 6704 
   26   13    7   (56)         653 
   28   14    8   (57)         26064 
   28   14    8   (56)         727 
   30   15    8   (56)         2167 
   31   16    8   (57)†        23642 
   32   16    8   (56)         557 
   39   20    8   (57)†        236503 
   40*  20    8   (57)         636503 
   40   20    9   (56)         5723 
   42   21   10   (56)         14573 
   56*  28   12   (57) 
   64*  32   12   (57) 
   88*  44   16   (57)         § 
  108   54   20   (57) 


*Self-dual, weight divisible by 4. 
†With first column deleted. 
§$a(x) = 62473705602153$. 

Fig. 16.7. Good double circulant codes. 

§8. Quadratic-residue and symmetry codes over GF(3) 


Examples of extended QR codes over GF(3) are given in Fig. 16.2. These 
codes have length 12m or 12m +2, by Problem 25 below, and have all weights 
divisible by 3 (Theorem 8). The first is the Golay code $\mathcal{G}_{12}$ (see p. 482). The 
minimum distances of the [24, 12, 9], [48, 24, 15], and [60, 30, 18] codes were 
found with the aid of a computer. As will be shown in Ch. 19, these are the 
largest possible minimum distances for self-dual codes over GF(3). 


Pless symmetry codes. These are double circulant codes over GF(3) (see Fig. 
16.8). 


Definition. Let $p$ be a prime $\equiv -1 \pmod 6$, and let $S$ be the following 
$(p + 1) \times (p + 1)$ matrix (with rows and columns labeled $\infty, 0, 1, \ldots, p - 1$): 

$$S = \begin{pmatrix} 0 & 1 & \cdots & 1 \\ \epsilon & & & \\ \vdots & & C & \\ \epsilon & & & \end{pmatrix} \tag{60}$$

where $\epsilon = 1$ if $p = 4k + 1$, $\epsilon = -1$ if $p = 4k - 1$, and $C$ is a circulant matrix in 
which the first row has a 0 in column 0, a 1 in columns which are quadratic 
residues mod $p$, and a $-1$ for the nonresidues. Then the Pless symmetry code 
$S_{2p+2}$ is the $[2p + 2, p + 1]$ code over GF(3) with generator matrix $[I \mid S]$, where 
$I$ is a $(p + 1) \times (p + 1)$ unit matrix. 


Example. For $p = 5$, $S_{12}$ has generator matrix 

[matrix $[I \mid S]$ with the columns of $S$ labeled ∞ 0 1 2 3 4; display not recoverable] (61) 

Problems. (18) Show that $S_{12}$ is equivalent to the Golay code $\mathcal{G}_{12}$. 
(19) (i) Show that $SS^T = -I$ (over GF(3)). 
(ii) If $p = 4k + 1$, $S = S^T = -S^{-1}$; if $p = 4k - 1$, $S = -S^T = S^{-1}$. 


Theorem 17. $S_{2p+2}$ is a self-dual code with all weights divisible by 3. 
Proof. By Problem 19. Q.E.D. 
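
Problem 19(i) and Theorem 17 can be checked numerically for the smallest case. The sketch below rests on an assumption about the layout of (60): a first row $0, 1, \ldots, 1$, a first column of $\epsilon$'s below it, and the circulant $C$ with entries $\chi(j - i)$. It builds $S$ for $p = 5$, verifies $SS^T = -I$ over GF(3), and enumerates the $3^6$ codewords of $[I \mid S]$.

```python
from itertools import product

def symmetry_matrix(p):
    """S as in (60), under the layout assumption stated above; entries are 0, 1, -1."""
    chi = lambda x: 0 if x % p == 0 else (1 if pow(x % p, (p - 1) // 2, p) == 1 else -1)
    eps = 1 if p % 4 == 1 else -1
    S = [[0] + [1] * p]
    for i in range(p):
        S.append([eps] + [chi(j - i) for j in range(p)])
    return S

def is_minus_identity_mod3(S):
    n = len(S)
    return all(sum(S[i][k] * S[j][k] for k in range(n)) % 3 == (2 if i == j else 0)
               for i in range(n) for j in range(n))

def weights(S):
    """Hamming weights of all nonzero codewords u[I|S] over GF(3)."""
    n = len(S)
    out = []
    for u in product(range(3), repeat=n):
        if any(u):
            right = [sum(u[i] * S[i][j] for i in range(n)) % 3 for j in range(n)]
            out.append(sum(x != 0 for x in u) + sum(x != 0 for x in right))
    return out

S5 = symmetry_matrix(5)
w = weights(S5)
print(is_minus_identity_mod3(S5), min(w), sorted(set(w)))
```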


Examples. The first five symmetry codes have parameters 

   [12, 6, 6], [24, 12, 9], [36, 18, 12], [48, 24, 15], [60, 30, 18].        (62) 

$S_{24}$, $S_{48}$ and $S_{60}$ have the same Hamming weight enumerator as the extended 
QR codes with the same parameters (see Ch. 19), but are not equivalent to 
these codes (Problem 20). Note that these 5 codes have $d = \frac{1}{4}n + 3$, which is 
the largest it can be (see Ch. 19). Unfortunately the next code is an [84, 42] 
code for which it is known that $d \le 21$. 


Theorem 18. The automorphism group of $S_{2p+2}$ contains the following 
monomial transformations: if a codeword $|L|R|$ is in $S_{2p+2}$ so are 

$$-|L|R|, \quad |R|-\epsilon L|, \quad |S(L)|S(R)|, \quad |V(L)|V(R)| \quad \text{and} \quad |T'(L)|T'(R)|,$$

where $S$, $V$, $T'$ are as in Theorem 9. Hence $\mathrm{Aut}(S_{2p+2})$ contains a subgroup 
isomorphic to $PSL_2(p)$. 


For $p \equiv -1 \pmod 6$, $S_{2p+2}$ is a $[2p + 2, p + 1]$ double circulant, 
self-dual code over GF(3) with generator matrix $[I \mid S]$, $S$ given by 
(60). $\mathrm{Aut}(S_{2p+2})$ contains $PSL_2(p)$, and if $|L|R|$ is a codeword, so is 
$|R|-\epsilon L|$. (62) gives examples. 


Fig. 16.8. Properties of Pless symmetry codes. 





Problems. (20) Use Theorems 12 and 13 to show that $\hat{\mathcal{Q}}$ and $S_{2p+2}$ are not 
equivalent for $2p + 2 = 24$, 48 and 60. [Hint: $PSL_2(p) \not\subset PSL_2(2p + 1)$.] 

(21) Show that any extended QR or symmetry code of length $n = 12m$ over 
GF(3) contains at least $2n$ codewords of weight $n$. In fact there is an 
equivalent code which contains the rows of a Hadamard matrix of order $n$ 
and their negatives. 


Research Problem (16.8). How does $d$ grow with $p$ in $S_{2p+2}$? 

5-designs. When the results of Ch. 6, especially Theorem 9 and Corollary 30, 
are applied to the codes in Fig. 16.2 and (62), a number of 5-designs are 
obtained. The parameters are determined by the weight enumerators of these 
codes (derived in Ch. 19) and are shown in Fig. 16.9. For example the second 
line of the figure means that the codewords of weights 8, 12, 16 and 24 in the 
[24, 12, 8] extended binary Golay code form 5-designs, and the first of these is 
a 5-(24, 8, 1) design (in agreement with Theorems 24-26 of Ch. 2). See also (67) 
below. 


(i) From QR codes 

   [n, k, d]       field    design from          these weights also 
                            min. wt. words       give 5-designs 
   [12, 6, 6]      GF(3)    5-(12, 6, 1)         9 
   [24, 12, 8]     GF(2)    5-(24, 8, 1)         12, 16, 24 
   [24, 12, 9]     GF(3)    5-(24, 9, 6)         12, 15 
   [48, 24, 12]    GF(2)    5-(48, 12, 8)        16, 20, 24, ..., 36, 48 
   [48, 24, 15]    GF(3)    5-(48, 15, 364)      18, 21, 24, 27 
   [60, 30, 18]    GF(3)    5-(60, 18, 1530)     21, 24, 27, 30, 33 

(ii) From symmetry codes 

   [24, 12, 9]     GF(3)    5-(24, 9, 6)         12, 15 
   [36, 18, 12]    GF(3)    5-(36, 12, 45)       15, 18, 21 
   [48, 24, 15]    GF(3)    5-(48, 15, 364)      18, 21, 24, 27 
   [60, 30, 18]    GF(3)    5-(60, 18, 1530)     21, 24, 27, 30, 33 


Fig. 16.9. 5-Designs from codes. 


§9. Decoding of cyclic codes and others 


Encoding of cyclic codes was described in §8 of Ch. 7, and specific 
techniques for decoding BCH codes were given in §6 of Ch. 9, alternant codes 
in §9 of Ch. 12, and Reed-Muller codes in §§6, 7 of Ch. 13. In this section 
several decoding methods are described which may be applicable to cyclic or 
double circulant codes. 


Let $\mathcal{C}$ be an $[n, k, d]$ binary code. Suppose the codeword $c = c_0 c_1 \cdots c_{n-1}$ is 
transmitted, an error vector $e = e_0 e_1 \cdots e_{n-1}$ occurs with weight $t$ where 
$d \ge 2t + 1$, and the vector $y = c + e = y_0 y_1 \cdots y_{n-1}$ is received. For simplicity 
we assume that $\mathcal{C}$ has generator matrix $G = [A \mid I]$ and parity check matrix 
$H = [I \mid A^T]$. Thus $c_0 \cdots c_{r-1}$ are the $r = n - k$ check symbols in the codeword 
and $c_r \cdots c_{n-1}$ are the $k$ information symbols. For example we might be using 
Encoder #1 of §8 of Ch. 7. 

The following theorem is the basis for these decoding methods. 




Theorem 19. Suppose an error vector $e = e_0 e_1 \cdots e_{n-1}$ of weight $t$ occurs, where 
$2t + 1 \le d$. Let $y$ be the received vector, with syndrome $S = Hy^T$. If $\mathrm{wt}(S) \le t$, 
then the information symbols $y_r \cdots y_{n-1}$ are correct, and $S = (e_0 \cdots e_{r-1})^T$ gives 
the errors. If $\mathrm{wt}(S) > t$, then at least one information symbol $y_i$ ($r \le i \le n - 1$) 
is incorrect. 


Proof. (i) Suppose the information symbols are correct, i.e. $e_i = 0$ for $r \le i \le 
n - 1$. Then $S = [I \mid A^T]e^T = (e_0 \cdots e_{r-1})^T$ and $\mathrm{wt}(S) \le t$. (ii) Let $e_{(1)} = 
e_0 \cdots e_{r-1}$, $e_{(2)} = e_r \cdots e_{n-1}$, and suppose $e_{(2)} \ne 0$. Consider the codeword $c' = 
e_{(2)}G = e_{(2)}[A \mid I] = |e_{(2)}A|e_{(2)}|$. Since $c' \ne 0$, $\mathrm{wt}(c') \ge 2t + 1$ by hypothesis. 
Therefore $\mathrm{wt}(e_{(2)}) + \mathrm{wt}(e_{(2)}A) \ge 2t + 1$. Now $S = He^T = e_{(1)}^T + A^T e_{(2)}^T$, so 
$\mathrm{wt}(S) \ge \mathrm{wt}(A^T e_{(2)}^T) - \mathrm{wt}(e_{(1)}) \ge 2t + 1 - \mathrm{wt}(e_{(2)}) - \mathrm{wt}(e_{(1)}) = t + 1$. Q.E.D. 


Decoding method I: permutation decoding. This method applies to codes with 
a fairly large automorphism group, for example cyclic codes or, even better, 
quadratic-residue codes. 

Suppose we wish to correct all error vectors e of weight $\le t$, for some
fixed $t \le [\frac{1}{2}(d - 1)]$. In order to carry out the decoding, we must
find a set $P = \{\pi_1 = 1, \pi_2, \ldots, \pi_s\}$ of permutations which
preserve $\mathcal{C}$ and which have the property that for any error vector e
of weight $\le t$, some $\pi_i \in P$ moves all the 1's in e out of the
information places.


Example. Let $\mathcal{C}$ be the [7, 4, 3] Hamming code, with t = 1. Then a
suitable set of permutations is $P = \{1, S^3, S^6\}$, where S is the cyclic
permutation, as shown by Fig. 16.10. E.g. if e = 0001000, $S^6$ moves the 1
out of the information places: $S^6 e = 0010000$.


                              information
                                places
    1 · c    =  c_0 c_1 c_2   c_3 c_4 c_5 c_6
    S^3 · c  =  c_4 c_5 c_6   c_0 c_1 c_2 c_3
    S^6 · c  =  c_1 c_2 c_3   c_4 c_5 c_6 c_0


Fig. 16.10. The three permutations for permutation decoding of [7, 4, 3] code.


The decoding method is as follows. When y is received, each $\pi_i y$ and its
syndrome $S^{(i)} = H(\pi_i y)^T$ in turn is computed, until an i is found for
which $\mathrm{wt}(S^{(i)}) \le t$. Then from Theorem 19, the errors are all in
the first r places of $\pi_i y$, and are given by
$e_0 \cdots e_{r-1} = S^{(i)T}$. Therefore we decode y as

$$c = \pi_i^{-1}(\pi_i y + e_0 \cdots e_{r-1} 0 \cdots 0). \tag{63}$$

If $\mathrm{wt}(S^{(i)}) > t$ for all i, we conclude that more than t errors
have occurred. Very little work has been done on finding minimal sets P.

Error trapping (Rudolph and Mitchell [1133]). An example of permutation
decoding in which only the cyclic permutation S is used. Thus
$P = \{1, S, S^2, \ldots, S^{n-1}\}$. This method will correct all error
vectors e which contain a circular run of at least k zeros, i.e. contain a
string of at least k consecutive zeros if e is written in a circle.


Problem. (22) Show that all errors of weight $\le t$ will be corrected, where
$t = [(n - 1)/k]$.


So this method is only useful for correcting error vectors with a big gap
between the 1's, i.e. for correcting a small number of random errors, or a
burst of errors. For example if applied to the [23, 12, 7] Golay code,
$t = [22/12] = 1$, so some errors of weights 2 and 3 will not be corrected.

However, the method is very easy to implement. All we do is keep shifting
y until the syndrome of some shift $S^i y$ has weight $\le t$, and then decode
using (63). Note from Problem 36(b) of Ch. 7 that if the division circuit of
Encoder #1 is used to calculate the syndrome of y, it is easy to get the
syndromes corresponding to cyclic shifts of y. These are simply the contents
of the shift register at successive clock cycles.
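
The shifting procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the shift-register circuit referred to above: it assumes the [7, 4, 3] cyclic Hamming code with generator polynomial g(x) = 1 + x + x^3 (our choice of example), stores vectors as lists with the coefficient of x^0 first, and recomputes each syndrome as a remainder modulo g(x). The function names are hypothetical.

```python
G_POLY = [1, 1, 0, 1]  # assumed example: g(x) = 1 + x + x^3, lowest degree first


def poly_mod(v, g):
    """Remainder of v(x) modulo g(x) over GF(2); this is the syndrome of v."""
    v = list(v)
    r = len(g) - 1
    for i in range(len(v) - 1, r - 1, -1):
        if v[i]:
            for j, gj in enumerate(g):
                v[i - r + j] ^= gj
    return v[:r]


def encode(msg, g=G_POLY):
    """Systematic encoding: r check symbols first, then the k information symbols."""
    r = len(g) - 1
    checks = poly_mod([0] * r + list(msg), g)
    return checks + list(msg)


def shift(v, i):
    """Cyclic shift S^i: the symbol in place j moves to place j + i (mod n)."""
    i %= len(v)
    return v[-i:] + v[:-i] if i else list(v)


def error_trap_decode(y, t, g=G_POLY):
    """Shift y until the syndrome has weight <= t, correct the check places
    as in (63), and shift back. Returns None if no shift traps the errors."""
    n, r = len(y), len(g) - 1
    for i in range(n):
        z = shift(y, i)
        s = poly_mod(z, g)
        if sum(s) <= t:
            fixed = [a ^ b for a, b in zip(z, s + [0] * (n - r))]
            return shift(fixed, -i)
    return None
```

For instance, `error_trap_decode(y, 1)` applied to any codeword of this code with a single bit flipped returns the original codeword; with t = 1 the weight test is exactly the condition of Theorem 19.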


Examples. (1) The [31, 21, 5] double-error-correcting BCH code. Here
$[30/21] = 1$, so error trapping will not correct some double errors. However,
suppose we are willing to use the permutation $\sigma_2$ which sends
$y_0 y_1 \cdots y_{30}$ to $z_0 z_1 \cdots z_{30}$ where $z_{2i} = y_i$
(subscripts mod 31). $\sigma_2$ preserves the code, as shown in §5 of Ch. 8.
Also $\sigma_2^5 = 1$. It is not difficult to show that the set of 155
permutations $P = \{S^i \sigma_2^j : 0 \le i \le 30,\ 0 \le j \le 4\}$ will
move any error pattern of weight 2 out of the information places, and hence
can be used for permutation decoding.
(2) The [23, 12, 7] Golay code. Again error trapping can only correct 1
error. However the set of 92 permutations
$P = \{S^i \sigma_2^j : 0 \le i \le 22,\ j = 0, 1, 2 \text{ or } 10\}$ will move
any error pattern of weight $\le 3$ out of the information places (see
MacWilliams [873]), and can be used for permutation decoding.
Neither of these sets of permutations is minimal.
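
The covering property claimed for such sets P can be checked by brute force. The sketch below is an illustration only: it assumes the check symbols occupy places 0, ..., r - 1 (as in this section), models $S^i \sigma_2^j$ as the position map x -> 2^j x + i (mod n), and uses hypothetical function names.

```python
from itertools import combinations


def affine_perms(n, mult, shifts, powers):
    """Position maps x -> mult^j * x + i (mod n), i.e. S^i applied after sigma_mult^j."""
    return [[(pow(mult, j, n) * x + i) % n for x in range(n)]
            for i in shifts for j in powers]


def moves_errors_to_checks(n, r, perms, t):
    """True if every set of at most t error places is sent into the
    check places 0..r-1 by at least one permutation in perms."""
    for w in range(1, t + 1):
        for places in combinations(range(n), w):
            if not any(all(p[x] < r for x in places) for p in perms):
                return False
    return True


# [7, 4, 3] Hamming code with P = {1, S^3, S^6}
hamming_P = affine_perms(7, 1, [0, 3, 6], [0])
# [31, 21, 5] BCH code with the 155 permutations S^i sigma_2^j
bch_P = affine_perms(31, 2, range(31), range(5))
# the 31 cyclic shifts alone (error trapping)
bch_shifts_only = affine_perms(31, 2, range(31), [0])
```

Under these assumptions `moves_errors_to_checks(31, 10, bch_P, 2)` holds, while the cyclic shifts alone fail for weight 2, in line with Example (1).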


Research Problem (16.9). Find a minimal set P of permutations for decoding
(a) the Golay codes $\mathcal{G}_{23}$ and $\mathcal{G}_{24}$, (b) the
[31, 21, 5] BCH code, (c) the [48, 24, 12] QR code, etc.


Decoding method II: covering polynomials (Kasami [725]). This is a
modification of error trapping which permits more errors to be corrected.
Suppose we wish to correct all error patterns of weight $\le t$, where now t
is any integer such that $2t + 1 \le d$.
To do this we choose a set of polynomials

$$Q_0(x) = 0,\ Q_1(x),\ \ldots,\ Q_a(x),$$

called covering polynomials, with the following property. For any error vector
e(x) of weight $\le t$, there is a $Q_j(x)$ such that some cyclic shift of e(x)
agrees with $Q_j(x)$ in the information symbols. For example, if e(x) contains
a circular run of k consecutive zeros, there is some cyclic shift of e(x)
which agrees with $Q_0(x) = 0$ in the information symbols.


Example. For the [23, 12, 7] Golay code $\mathcal{G}_{23}$ with t = 3, we may
choose $Q_0(x) = 0$, $Q_1(x) = x^{16}$, $Q_2(x) = x^{17}$. For it is easy to
check that for any error vector e(x) of weight $\le 3$, drawn as a wheel on
the left of Fig. 16.11, we can turn the wheel so that it agrees with one of
the three wheels on the right in the
information places 11-22. E.g. if e(x) — x*- x - x”, the cyclic shift x "e(x) 
has just one nonzero symbol in the information places, in x, and agrees with 


Q(x). 





[Figure: four wheels, showing an error vector e(x) and the covering
polynomials $Q_0(x)$, $Q_1(x)$, $Q_2(x)$, with the check places marked.]


Fig. 16.11. Covering polynomials for Golay code.
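
The covering property is easy to test exhaustively. The following sketch assumes, as in this section, that the information places are r, ..., n - 1, and represents each covering polynomial by the set of its exponents; the function name is hypothetical.

```python
from itertools import combinations


def is_covering(n, k, cover_sets, t):
    """True if, for every error pattern of weight <= t, some cyclic shift
    agrees with one of the covering polynomials in the information places
    r..n-1. Each covering polynomial is given as a set of exponents."""
    r = n - k
    targets = {frozenset(q) for q in cover_sets}
    for w in range(1, t + 1):
        for err in combinations(range(n), w):
            found = False
            for i in range(n):
                shifted = [(x + i) % n for x in err]
                info_part = frozenset(x for x in shifted if x >= r)
                if info_part in targets:
                    found = True
                    break
            if not found:
                return False
    return True


# Golay code of length 23: Q_0 = 0, Q_1 = x^16, Q_2 = x^17
golay_ok = is_covering(23, 12, [set(), {16}, {17}], 3)
```

With these sets the check succeeds for t = 3, whereas `is_covering(23, 12, [set()], 3)` fails, which is the statement that error trapping alone does not handle all triple errors.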


Let $q_j = HQ_j^T$ be the syndrome of $Q_j$, j = 0, ..., a. The proof of the
following theorem is parallel to that of Theorem 19.


Theorem 20. Let $S = Hy^T$. The information symbols (i.e. the last k symbols)
of $y(x) + Q_j(x)$ are correct iff
$\mathrm{wt}(S + q_j) \le t - \mathrm{wt}(Q_j(x))$. If this is the case then
the errors in $y(x) + Q_j(x)$, which are in the first r symbols, are given by
$e_0 \cdots e_{r-1} = S^T + q_j^T$, and the errors in y(x) itself are given by
$Q_j + S^T + q_j^T$.


In view of this theorem we have the following decoding algorithm. Keep
shifting y(x) until some shift, $x^i y(x)$ say, has syndrome $S^{(i)}$ such that

$$\mathrm{wt}(S^{(i)} + q_j) \le t - \mathrm{wt}(Q_j(x))$$

for some j = 0, 1, ..., a. Then decode y(x) as

$$c(x) = x^{n-i}\left(x^i y(x) + Q_j + S^{(i)T} + q_j^T\right). \tag{64}$$


Problem. (23) Show that Q(x) 2 0, Qi(x) 2? x? and Qx(x) 7» x? are covering 
polynomials for the [31, 16,7] BCH code with t =3. 

Very similar decoding methods have been proposed by Karlin [716], Tzeng
and Zimmerman [1354], and Berlekamp [123]. For example instead of shifting
the received vector and adding the syndromes of the covering polynomials,
one can add all vectors of weight $\le t/2$ to the information places or the
check places until a syndrome is found with weight $\le t$.

As an illustration we give Berlekamp's decoding method for the [24, 12, 8]
Golay code $\mathcal{G}_{24}$, using the generator matrix (48), which we write
as

$$G = [I \mid B] = (u_1, \ldots, u_{12}, b_{13}, \ldots, b_{24}),$$

giving names to the columns. Thus $u_i$ is a vector with a 1 in coordinate i.
Since $\mathcal{G}_{24}$ is self-dual, $GG^T = 0$, hence $I + BB^T = 0$ or
$B^{-1} = B^T$. Also let

$$B^T = (b_1, \ldots, b_{12}),$$

giving names to the rows of B.
Suppose the codeword $c = (c_1, \ldots, c_{24})$ is transmitted, the error
vector $e = (e_1, \ldots, e_{24})$ occurs, and y = c + e is received. The
syndrome is

$$S = Hy^T = He^T = \sum_{i=1}^{12} e_i u_i + \sum_{i=13}^{24} e_i b_i. \tag{65}$$

Also

$$B^T S = B^T H e^T = [B^T \mid I] e^T = \sum_{i=1}^{12} e_i b_i + \sum_{i=13}^{24} e_i u_{i-12}. \tag{66}$$

Suppose that $\mathrm{wt}(e) \le 3$. Then at least one of the following
conditions must hold:

Case I: $\mathrm{wt}(e_{13}, \ldots, e_{24}) = 0$, $\mathrm{wt}(S) \le 3$,
$\sum_{i=1}^{12} e_i u_i = S$.

Case II: $\mathrm{wt}(e_1, \ldots, e_{12}) = 0$, $\mathrm{wt}(B^T S) \le 3$,
$\sum_{i=13}^{24} e_i u_{i-12} = B^T S$.

Case III: $\mathrm{wt}(e_{13}, \ldots, e_{24}) = 1$, so for some j,
$13 \le j \le 24$, $\mathrm{wt}(S + b_j) \le 2$ and
$\sum_{i=1}^{12} e_i u_i = S + b_j$.

Case IV: $\mathrm{wt}(e_1, \ldots, e_{12}) = 1$, so for some j,
$1 \le j \le 12$, $\mathrm{wt}(B^T S + b_j) \le 2$ and
$\sum_{i=13}^{24} e_i u_{i-12} = B^T S + b_j$.


Thus the decoding can be done by computing the weights of the 26 vectors
$S$, $S + b_j$ ($13 \le j \le 24$), $B^T S$, $B^T S + b_j$ ($1 \le j \le 12$).


Example. $S^T = 11100 \cdots 0$, $\mathrm{wt}(S) = 3$, so case I applies.
$S = u_1 + u_2 + u_3$, so $e = 11100 \cdots 0$.


Decoding method III: using t-designs (Assmus et al. [33, 42, 43], Goethals
[492, 493]). Threshold decoding (see §§6,7 of Ch. 13) is often possible if the
dual code contains a t-design. The idea is as follows. Suppose the codewords
of some weight w in the dual code form a t-design, for some t. Let
$h_1, \ldots, h_a$ be the blocks of this design (= codewords of weight w in
$\mathcal{C}^\perp$) which have a 1 in the first coordinate. Form the a parity
checks $y \cdot h_i$, where y is the received vector, and let $\nu$ be the
number of times $y \cdot h_i = 1$.

The decoding algorithm we seek is of the following form:

(i) If $\nu = 0$, no error has occurred;

(ii) If $\nu < \theta_1$, or if $\nu$ belongs to some set of exceptional
values $V_1$, then the first coordinate is correct;

(iii) If $\nu > \theta_2$, or if $\nu \in V_2$, then the first coordinate is
incorrect;

(iv) For all other values of $\nu$, decide that an uncorrectable error has
occurred.

Repeat for all coordinates of the received vector.

We illustrate by giving a threshold decoding algorithm for the Golay code
$\mathcal{G}_{24}$ (from [493]). Since $\mathcal{G}_{24}$ is self-dual, any
codeword can be used as parity check. Let us see how to decode one coordinate,
say the first. Take as parity checks the 253 codewords of weight 8 containing
the first coordinate (from Fig. 2.14). If there is exactly one error, the
number $\nu$ of parity checks which fail is either 253 or 77, depending on
whether or not the first coordinate is in error. If two errors occur, $\nu$ is
either 176 or 2 · 56 = 112 in the two cases. Finally if there are 3 errors,
$\nu$ is either 120 + 21 = 141 or 3 · 40 + 5 = 125 in the two cases. To
summarize:


                           ν if first coordinate
    number of errors       in error     not in error
           1                 253             77
           2                 176            112
           3                 141            125


Therefore there is a simple threshold test: if more than 133 out of the 253
parity checks fail, the first coordinate is in error; if less than 133 fail
this coordinate is correct.

This test can then be applied successively to all the coordinates (using the
fact that Aut($\mathcal{G}_{24}$) is transitive, Theorems 9 and 10).
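
The six values of $\nu$ in the table can be recomputed from the intersection numbers of the 5-(24, 8, 1) design formed by the weight-8 codewords. The sketch below is our own illustration, using only the standard fact that the number of blocks through i given points is $\lambda_i = \binom{24-i}{5-i} / \binom{8-i}{5-i}$, together with inclusion-exclusion; the function names are hypothetical.

```python
from math import comb


def lam(i, t=5, k=8, v=24):
    """Number of blocks of a t-(v, k, 1) design containing i given points (i <= t)."""
    return comb(v - i, t - i) // comb(k - i, t - i)


def blocks_through(j, m):
    """Blocks containing the first coordinate and j further given points,
    while avoiding m other given points (inclusion-exclusion)."""
    return sum((-1) ** s * comb(m, s) * lam(1 + j + s) for s in range(m + 1))


def failing_checks(num_errors, first_in_error):
    """Number of the 253 parity checks through the first coordinate that fail.
    A check fails iff it meets the error positions in an odd number of places."""
    others = num_errors - 1 if first_in_error else num_errors
    total = 0
    for j in range(others + 1):  # j = other error places lying in the block
        overlap = j + (1 if first_in_error else 0)
        if overlap % 2 == 1:
            total += comb(others, j) * blocks_through(j, others - j)
    return total


table = {e: (failing_checks(e, True), failing_checks(e, False)) for e in (1, 2, 3)}
```

Every "in error" count exceeds 133 and every "not in error" count is below it, which is the threshold test stated above.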

A more complicated algorithm for the [48, 24, 12] QR code is given in [43]. 

It appears to be difficult to find a general threshold decoding algorithm of 
this type, or even to decide how many errors such an algorithm can correct. 
Rahman and Blake [1086] have given a lower bound on this number in terms of 
the parameters of the t-design. 


Research Problem (16.10). If the codewords of each weight in
$\mathcal{C}^\perp$ form a t-design, how many errors can be corrected by
threshold decoding?

Of course if we allow arbitrary weights to be given to the parity checks we
know from Theorem 21 of Ch. 13 that in theory $[\frac{1}{2}(d - 1)]$ errors can
be corrected for any code (but we don't know how to do it).


Notes on Chapter 16 


Further properties of quadratic residues (cf. §3 of Ch. 2). General references
are LeVeque [825, Vol. I] and Ribenboim [1111]. We begin by proving the
fundamental theorem that 2 is a quadratic residue mod p if $p = 8m \pm 1$, and
a nonresidue if $p = 8m \pm 3$. First some lemmas.


Lemma 21. (Euler's criterion.) For any integer a,
$\chi(a) \equiv a^{(p-1)/2} \bmod p$ if p is odd.


Proof. The result is obvious if a is a multiple of p, so assume
$a \not\equiv 0 \bmod p$. Let $\rho$ be a primitive element of GF(p); then
$\rho^{(p-1)/2} \equiv -1 \bmod p$. Now $a \equiv \rho^i \bmod p$ for some i
and so $a^{(p-1)/2} \equiv (-1)^i \bmod p$. From §6 of Ch. 4, a is a quadratic
residue iff i is even. Q.E.D.


Lemma 22. (Gauss' criterion.) Let p be an odd prime and let $\mu$ be the
number of elements of the set $a, 2a, \ldots, \frac{1}{2}(p-1)a$ whose
numerically least residue mod p is negative. Then $\chi(a) = (-1)^\mu$.


Example. p = 7, a = 2. The set 2, 4, 6 becomes 2, -3, -1, and $\mu = 2$.


Proof. Let $r_1, r_2, \ldots, -r'_1, -r'_2, \ldots$ be the values of the
numerically least residues of $a, 2a, \ldots, \frac{1}{2}(p-1)a$. Clearly
$r_i \ne r_j$, $r'_i \ne r'_j$ if $i \ne j$, and $r_i \ne r'_j$; for if
$m_1 a \equiv r_i \bmod p$ and $m_2 a \equiv -r'_j \bmod p$ then $r_i = r'_j$
implies $(m_1 + m_2)a \equiv 0 \bmod p$, or $m_1 + m_2$ is divisible by p.
This is impossible since $m_1 + m_2 \le p - 1$. Hence the $\frac{1}{2}(p-1)$
numbers $r_i, r'_j$ are distinct integers between 1 and $\frac{1}{2}(p-1)$,
so are the numbers $1, 2, \ldots, \frac{1}{2}(p-1)$ in some order. Thus

$$a \cdot 2a \cdots \tfrac{1}{2}(p-1)a \equiv (-1)^\mu \left(\frac{p-1}{2}\right)! \bmod p$$

or

$$a^{(p-1)/2} \equiv (-1)^\mu \bmod p.$$

The result now follows from Lemma 21. Q.E.D.


Theorem 23. 2 is a quadratic residue mod p if $p = 8m \pm 1$, and a nonresidue
if $p = 8m \pm 3$.


Proof. Set a = 2 in Lemma 22. $\mu$ is the number of integers 2m
($1 \le m \le \frac{1}{2}(p-1)$) such that $2m > \frac{1}{2}p$, or
$m > p/4$. Thus $\mu = \frac{1}{2}(p-1) - [p/4]$. The four cases are


       p           μ
    8m - 3      4m - 2 - (2m - 1) = 2m - 1
    8m - 1      4m - 1 - (2m - 1) = 2m
    8m + 1      4m - 2m = 2m
    8m + 3      4m + 1 - 2m = 2m + 1


Q.E.D.
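
Lemmas 21 and 22 and Theorem 23 are easy to confirm numerically for small primes. The sketch below is only a sanity check, not part of the proofs; it computes $\chi(a)$ by direct search for a square root, and the function names are hypothetical.

```python
def chi(a, p):
    """Quadratic character mod an odd prime p: 1 for a nonzero square,
    -1 for a nonresidue, 0 if p divides a."""
    a %= p
    if a == 0:
        return 0
    return 1 if any(x * x % p == a for x in range(1, p)) else -1


def gauss_mu(a, p):
    """Number of multiples a, 2a, ..., ((p-1)/2)a whose numerically least
    residue mod p is negative, i.e. whose residue exceeds (p-1)/2."""
    half = (p - 1) // 2
    return sum(1 for m in range(1, half + 1) if (m * a) % p > half)


odd_primes = [p for p in range(3, 100)
              if all(p % q for q in range(2, int(p ** 0.5) + 1))]

# primes below 100 for which 2 is a quadratic residue
two_is_residue = [p for p in odd_primes if chi(2, p) == 1]
```

For p = 7 and a = 2 this gives `gauss_mu(2, 7) == 2`, as in the example after Lemma 22, and the primes in `two_is_residue` are exactly those congruent to ±1 mod 8.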


Theorem 24. (Perron.) (i) Suppose p = 4k - 1. Let $r_1, \ldots, r_{2k}$ be the
2k quadratic residues mod p together with 0, and let a be a number relatively
prime to p. Then among the 2k numbers $r_i + a$ there are k residues (possibly
including 0) and k nonresidues.

(ii) Suppose p = 4k - 1. Let $n_1, \ldots, n_{2k-1}$ be the 2k - 1
nonresidues, and let a be prime to p. Then among the 2k - 1 numbers $n_i + a$
there are k residues (possibly including 0) and k - 1 nonresidues.

(iii) Suppose p = 4k + 1. Among the 2k + 1 numbers $r_i + a$ are, if a is
itself a residue, k + 1 residues (including 0) and k nonresidues; and, if a is
a nonresidue, k residues (not including 0) and k + 1 nonresidues.

(iv) Suppose p = 4k + 1. Among the 2k numbers $n_i + a$ are, if a is itself a
residue, k residues (not including 0) and k nonresidues; and, if a is a
nonresidue, k + 1 residues (including 0) and k - 1 nonresidues.


For the proof see Perron [1035]. 
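
The four counts in Theorem 24 can be tabulated directly for small primes. In this sketch (an illustration only, with hypothetical names) 0 is counted among the residues, following the convention of the theorem.

```python
def perron_counts(p, a, use_residues):
    """Return (number of residues incl. 0, number of nonresidues) among the
    numbers b + a mod p, where b runs over the residues together with 0
    (use_residues=True) or over the nonresidues (use_residues=False)."""
    squares = {x * x % p for x in range(1, p)}
    if use_residues:
        base = squares | {0}
    else:
        base = set(range(1, p)) - squares
    shifted = [(b + a) % p for b in base]
    residues = sum(1 for s in shifted if s == 0 or s in squares)
    return residues, len(shifted) - residues


def perron_expected(p, a, use_residues):
    """The counts asserted by Theorem 24 for a prime to p."""
    a_is_residue = any(x * x % p == a % p for x in range(1, p))
    if p % 4 == 3:
        k = (p + 1) // 4
        return (k, k) if use_residues else (k, k - 1)
    k = (p - 1) // 4
    if use_residues:
        return (k + 1, k) if a_is_residue else (k, k + 1)
    return (k, k) if a_is_residue else (k + 1, k - 1)
```

For example, with p = 7 (k = 2) and a = 1 the residues together with 0 are {0, 1, 2, 4}, and adding 1 gives {1, 2, 3, 5}, which contains 2 residues and 2 nonresidues as in part (i).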


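Problems. (24) Gauss' quadratic reciprocity law. Set

$$\left(\frac{a}{p}\right) = \begin{cases} 1 & \text{when } a \text{ is a quadratic residue mod } p, \\ -1 & \text{when } a \text{ is a nonresidue mod } p. \end{cases}$$

If p, q are distinct odd primes, show that

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}.$$

(25) Show that 3 is a quadratic residue mod p iff $p \equiv \pm 1 \pmod{12}$.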


§2. Quadratic-residue codes have been extensively studied by Assmus and
Mattson in a long series of reports and papers [34], [37-39, 41-47, 49-53, 927]
and a great deal of this chapter is based on their work. An alternative method
of defining and generalizing QR codes has recently been given by Ward
[1387]. His approach definitely merits further study.

It is possible to use (3) to define QR codes over any field GF(q), where q is
not necessarily prime, in which case most of the properties given in this
chapter still hold. QR codes over GF(4) were mentioned briefly in Problem 15.
Since the cases q = 2 and 3 are the most important we have restricted
ourselves to q = l = prime. But there are some interesting QR codes over
GF(4); see Assmus and Mattson [42, 44]. One example is the [6, 3, 4] code
over GF(4) given in Example E9 of Ch. 19, §1. The group of this code is three
times the alternating group on 6 letters. With this additional exception,
Theorem 13 still holds even if l is allowed to be a prime power. See also Stein
and Bhargava [1274]. Another example is a [30, 15, 12] code over GF(4); see
[41]. The weight distribution of the latter is found in Ch. 19, and hence from
Corollary 30 of Ch. 6 gives 5-designs with parameters


5-(30, 12, 220), 5-(30, 14, 5390) and 5-(30, 16, 123000), (67)


which could be added to Fig. 16.9. For more about GF(4) codes see [1478]. 
The sources for Fig. 16.2 are Assmus and Mattson (references as above), 
Berlekamp [113, p. 360], Karlin [715,717], Karlin and MacWilliams [718], 
Mykkeltveit et al. [979], and Rosenstark [1124]. See also Pless [1052], 
Solomon [1248]. 
Van Tilborg [1325] has obtained the following strengthening of Theorem 1. 




Theorem 25. If $\mathcal{Q}$ is a binary QR code of length p = 8m - 1, and
$d^2 - d + 1 = p$, then (i) $p = 64i^2 + 40i + 7 \ge 2551$ and d = 8i + 3, for
some i, and (ii) there exists a projective plane of order d - 1. If
$d^2 - d + 1 > p$, then $d^2 - d - 11 \ge p$.


This shows that the [47, 24] QR code has $d \ge 11$. Now $d \ge 15$ is
impossible by the sphere-packing bound (Ch. 1), hence d = 11 or 12 by
Theorem 8. But d must be odd by Corollary 15 of Ch. 8, so we conclude that
d = 11.

Assmus, Mattson and Sachar [49] have generalized Theorem 1 as follows. 


Theorem 26. Suppose $\mathcal{C}$ is an $[n, \frac{1}{2}(n + 1), d]$ cyclic
code over GF(q) such that $\mathcal{C} \supset \mathcal{C}^\perp$. If the
supports of the minimum weight codewords form a 2-design then
$d^2 - d + 1 \ge n$.


§3. For Theorem 3 see Ribenboim [1111]. Theorem 4 is due to N. Patterson,
unpublished.


§5. For properties of $\mathrm{PSL}_2(p)$ see for example Conway [306],
Coxeter and Moser [314] or Huppert [676]. Theorems 10, 12, 13 are given by
Assmus and Mattson [41, 47, 927]. See also Shaughnessy [1193, 1194]. We
mention without giving any further details the following theorem of Assmus
and Mattson [47, 53], and Shaughnessy [1193]:


Theorem 27. Suppose $\mathrm{Aut}(\hat{\mathcal{Q}})^\dagger$ properly contains
$\mathrm{PSL}_2(p)$. Then

(i) $\mathrm{Aut}(\hat{\mathcal{Q}})^\dagger$ is isomorphic to a nonsolvable
transitive permutation group on p letters.

(ii) If $p = 2m + 1 \ge 23$, and m is a prime, then
$\mathrm{Aut}(\hat{\mathcal{Q}})^\dagger$ is simple and is isomorphic to a
5-fold transitive group.


The proof of Theorem 13 is based on Theorem 27 and a computer search 
made by Parker and Nikolai [1023] for groups with certain properties. 


Research Problem (16.11). Extend Parker and Nikolai's search to larger values 
of p. 


Some further results about permutation groups which are applicable to QR 
codes are given in Neumann [988, 989]. 

It is worth mentioning that Kantor [714] has shown that $\mathrm{PSL}_2(p)$ is
the full automorphism group of (p + 1) x (p + 1) Hadamard matrices of Paley
type for
all p > 11. (See also Assmus and Mattson [39], Hall [588, 590].) Problem 2 is 
from [47]. 


§§6,7. Leech [803] seems to have been the first to obtain the generator matrix
(48) for $\mathcal{G}_{24}$. Karlin [715] studied the decomposition of QR codes
into double
circulants. For properties and tables of double circulant and quasi-cyclic 
codes see Bhargava et al. [142], Chen et al. [275], Hoffner and Reddy [659], 
Karlin [715], Kasami [730], Peterson and Weldon [1040, §8.14], Stein and 
Bhargava [1273], Tavares et al. [1311], and Townsend and Weldon [1338]. 
Another proof of Lemma 14 follows from Huppert [676, Theorem 8.4(d), p. 192]. 

The techniques for finding low weight codewords are based on Karlin 
[715], Karlin and MacWilliams [718], Mykkeltveit, Lam and McEliece [979]. 
Assmus and Mattson [44] describe another method using cyclotomy. 

That 2 is a primitive element of infinitely many fields GF(k), k = prime, is 
known as Artin’s conjecture; see Shanks [1186, p. 81]. 

For the double circulant codes $\mathcal{B}$ see Karlin [715, 717]. The entries in Fig.
16.7 are taken from [275,715, 717, 1273, 1311]. The weight distributions of 
some of these codes are given in [1273] and [1311]. These references also give 
some good multiple circulant codes, with generator matrices of the form 
[I|A|B|---] where A, B,...are circulants. But even less is known about 
codes of this type. See also Bhargava and Stein [141]. For circulant and other 
matrices over finite fields see Bellman [100], Berlekamp [109], Carlitz [247- 
248], Daykin [339], MacWilliams [878, 881] and Muir [974]. 


§8. For QR codes over GF(3) see the papers of Assmus and Mattson. The
Pless symmetry codes have been extensively studied by Pless [1055-
1057, 1060]; see also Blake [160, 161]. Figure 16.9 is from Assmus and Mattson
[41]. Theorems 17, 18 and Problem (20) are due to Pless [1055].


§9. Permutation decoding was introduced by Prange [1078], but our treatment
follows MacWilliams [873]. General properties of cyclic codes which can be
permutation decoded have been studied by Shiva et al. [1201, 1202], and Yip
et al. [1447].

Berlekamp (unpublished) has found a set of 40 permutations which can be
used to decode the Golay code $\mathcal{G}_{23}$.

Kasami [725] sketches circuits for implementing Decoding Method II, and 
gives covering polynomials for several other codes. 


Research Problem (16.12). Give a systematic method of finding covering 
polynomials. 

Berlekamp [123] also considers which error patterns of weight 4 can be 
corrected. Further papers on decoding cyclic and other codes are: Ahamed 
[8-11], Chase [264], Chien and Lum [291], Ehrich and Yau [405], Hartmann 
and Rudolph [614], Kautz [749], Mitchell [964], Miyakawa and Kaneko [966], 
Nili [994], Omura [1012, 1013], Pierce [1046], Solomon [1255], Tanaka and 
Kaneku [1299], Weldon [1404], Zierler [1462, 1463] and Zlotnikov and Kaiser 
[1473]. 








Bounds on the size of a code 


§1. Introduction 


Probably the most basic problem in coding theory is to find the largest 
code of a given length and minimum distance. Let 


A(n, d) = maximum number of codewords in any 
binary code (linear or nonlinear) of length n 
and minimum distance d between codewords. 


In this chapter upper and lower bounds are given for A(n, d), both for small n
(see Figs. 1, 2 of Appendix A) and large n (see Fig. 17.7 below). For large n
the best upper bound is the McEliece-Rodemich-Rumsey-Welch bound
(Theorems 35 and 36), which is obtained via the important linear programming
approach (see §4). This is a substantial improvement on the Elias bound, the
old record-holder (Theorem 34). The best lower bound for large n is the
Gilbert-Varshamov bound (Theorem 30), but there is still a considerable gap
between the upper and lower bounds (see Research Problem 17.9).

One way of tackling A(n,d) is by studying sets of codewords having 
constant weight. To this end we define 


A(n, d, w) = maximum number of binary vectors of 
length n, Hamming distance at least d apart, 
and constant weight w. 


Fig. 3 of Appendix A gives a table of small values of A(n, d, w). The best
upper bounds on A(n, d, w) are the Johnson bounds (§2), which are used in §3
to obtain bounds on A(n, d). The linear programming approach, which applies
to both large and small values of n, and to both A(n, d) and A(n, d, w), is
described in §4.

Another important function is


B(n, d) = greatest number of codewords in any 
linear binary code of length n and minimum distance d. 


Of course $B(n, d) \le A(n, d)$. An upper bound which applies specifically to
B(n, d) is the Griesmer bound, described in §5. An interesting technique for 
constructing linear codes which sometimes meet the Griesmer bound is 
presented in §6. The construction uses anticodes, which are codes which have 
an upper bound on the distance between codewords (and may contain 
repeated codewords). An excellent table of B(n, d) covering the range n < 127 
has been given by Helgert and Stinaff [636], so is not given here. For large n 
the best bounds known for B(n, d) are the same as those for A(n, d). 

Instead of asking for the largest code of given length and minimum 
distance, one could look for the code of shortest length having M codewords 
and minimum distance d. Call this shortest length N(M, d). 


Problem. (1) Show that the solution to either problem gives the solution to the 
other. More precisely, show that if the values of A(n, d) are known then so 
are the values of N(M, d) and vice versa. State a similar result for linear 
codes. 


Most of this chapter can be read independently of the others. The follow- 
ing papers dealing with bounds are not mentioned elsewhere in the chapter, 
but are included for completeness: Bambah et al. [64], Bartow [74, 75], 
Bassalygo [76], Berger [105], Chakravarti [259], Freiman [457], Hatcher [626], 
Joshi [700], Levenshtein [823,824], Levy [830], MacDonald [868], McEliece 
and Rumsey [948], Myravaagnes [980], Peterson [1037], Sacks [1140], Sidel- 
nikov [1209, 1210], Strait [1283] and Welch et al. [1399]. 


§2. Bounds on A(n, d, w)


We begin by studying A(n, d, w), the maximum number of binary vectors 
of length n, distance at least d apart, and constant weight w. This is an 
important function in its own right, and gives rise to bounds on A(n, d) via 
Theorems 13 and 33. In this section we give Johnson's bounds (Theorems 2-5) 
on A(n, d, w), and then a few theorems giving the exact value in some special 
cases. A table of small values of A(n, d, w) is given in Fig. 3 of Appendix A. 
(Another bound on A(n, d, w), using linear programming, is described in §4.)

Theorem 1.
(a) $A(n, 2\delta - 1, w) = A(n, 2\delta, w)$.
(b) $A(n, 2\delta, w) = A(n, 2\delta, n - w)$.
(c) $A(n, 2\delta, w) = 1$ if $w < \delta$.
(d) $A(n, 2\delta, \delta) = [n/\delta]$.


Proof. (a) follows because the distance between vectors of equal weight is
even. To get (b) take complements. (c) is obvious. (d) follows because the
codewords must have disjoint sets of 1's. Q.E.D.


The next result generalizes (d): 
Theorem 2. (Johnson [693]; see also [694-697].)

$$A(n, 2\delta, w) \le \left[ \frac{\delta n}{w^2 - wn + \delta n} \right] \tag{1}$$

provided the denominator is positive.


Proof. Let $\mathcal{C}$ be an $(n, M, 2\delta)$ constant weight code which
attains the bound $A(n, 2\delta, w)$; thus $M = A(n, 2\delta, w)$. Let
$X = (x_{i\nu})$ be an M x n array of the codewords of $\mathcal{C}$; each row
has weight w. Evaluate in two ways the sum of the inner products of the rows:

$$\sum_{i=1}^{M} \sum_{\substack{j=1 \\ j \ne i}}^{M} \sum_{\nu=1}^{n} x_{i\nu} x_{j\nu}.$$

Since the distance between any two rows is at least $2\delta$, their inner
product is at most $w - \delta$. Hence the sum is
$\le (w - \delta)M(M - 1)$.
On the other hand, the sum is also equal to

$$\sum_{\nu=1}^{n} \sum_{i=1}^{M} \sum_{\substack{j=1 \\ j \ne i}}^{M} x_{i\nu} x_{j\nu}.$$

If $k_\nu$ is the number of 1's in the $\nu$th column of X, this column
contributes $k_\nu(k_\nu - 1)$ to the sum. Hence

$$\sum_{\nu=1}^{n} k_\nu(k_\nu - 1) \le (w - \delta)M(M - 1). \tag{2}$$

But

$$\sum_{\nu=1}^{n} k_\nu = wM$$

(the total number of 1's in X), and $\sum_{\nu=1}^{n} k_\nu^2$ is minimized if
all $k_\nu = wM/n$, in which case

$$\sum_{\nu=1}^{n} k_\nu^2 = \frac{w^2 M^2}{n}.$$

Therefore from (2)

$$\frac{w^2 M^2}{n} - wM \le (w - \delta)M(M - 1).$$

Solving for M gives (1). Q.E.D.


Since the $k_\nu$'s must be integers this can be strengthened slightly.


Theorem 3. Suppose $A(n, 2\delta, w) = M$, and define k and t by

$$wM = nk + t, \quad 0 \le t < n.$$

(This is the total number of 1's in all codewords.) Then

$$nk(k - 1) + 2kt \le (w - \delta)M(M - 1). \tag{3}$$


Proof. The minimum of $\sum_{\nu=1}^{n} k_\nu^2$, subject to

$$\sum_{\nu=1}^{n} k_\nu = wM$$

and the $k_\nu$'s being integers, is attained when
$k_1 = \cdots = k_t = k + 1$, $k_{t+1} = \cdots = k_n = k$. This minimum is

$$t(k + 1)^2 + (n - t)k^2,$$

so from (2)

$$t(k + 1)^2 + (n - t)k^2 - (nk + t) \le (w - \delta)M(M - 1).$$

Q.E.D.


Examples. From Theorem 2

$$A(9, 6, 4) \le \left[ \frac{3 \cdot 9}{16 - 36 + 27} \right] = \left[ \frac{27}{7} \right] = 3,$$

and the code {111100000, 100011100, 010010011} shows that A(9, 6, 4) = 3.
Again from Theorem 2,

$$A(8, 6, 4) \le \left[ \frac{24}{8} \right] = 3.$$

But if A(8, 6, 4) = 3, then applying Theorem 3 we find

$$3 \cdot 4 = 1 \cdot 8 + 4,$$

so k = 1, t = 4 and Equation (3) is violated. Hence $A(8, 6, 4) \le 2$. The
code {11110000, 10001110} shows that A(8, 6, 4) = 2.
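
The arithmetic in these examples can be reproduced mechanically. The sketch below (hypothetical function names, integer arithmetic only) evaluates the bound (1), tests condition (3) for a proposed value of M, and lowers the bound of Theorem 2 while (3) fails. It is meant as a calculator for Theorems 2 and 3 under the assumption that the denominator in (1) is positive, not as a general method for finding A(n, 2δ, w).

```python
def johnson_theorem2(n, delta, w):
    """Upper bound (1) on A(n, 2*delta, w); None if the denominator is not positive."""
    den = w * w - w * n + delta * n
    return (delta * n) // den if den > 0 else None


def violates_theorem3(n, delta, w, M):
    """True if inequality (3) fails for this M, so that A(n, 2*delta, w) != M."""
    k, t = divmod(w * M, n)
    return n * k * (k - 1) + 2 * k * t > (w - delta) * M * (M - 1)


def johnson_bound(n, delta, w):
    """Start from Theorem 2 and decrease M while Theorem 3 rules it out."""
    M = johnson_theorem2(n, delta, w)
    if M is None:
        return None
    while M > 1 and violates_theorem3(n, delta, w, M):
        M -= 1
    return M


def min_distance(words):
    """Minimum Hamming distance of a list of equal-length 0/1 strings."""
    return min(sum(a != b for a, b in zip(u, v))
               for i, u in enumerate(words) for v in words[i + 1:])
```

With these definitions `johnson_bound(9, 3, 4)` is 3 and `johnson_bound(8, 3, 4)` is 2, and `min_distance` confirms that the two example codes have distance 6.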

The following recurrence is useful in case Theorem 2 doesn’t apply (and in 
some cases when it does). 


Theorem 4.

$$A(n, 2\delta, w) \le \left[ \frac{n}{w} A(n - 1, 2\delta, w - 1) \right]. \tag{4}$$


Proof. Consider the codewords which have a 1 in the ith coordinate. If this
coordinate is deleted, a code is obtained with length n - 1, distance
$\ge 2\delta$, and constant weight w - 1. Therefore the number of such
codewords is $\le A(n - 1, 2\delta, w - 1)$. The total number of 1's in the
original code thus satisfies

$$wA(n, 2\delta, w) \le nA(n - 1, 2\delta, w - 1).$$

Q.E.D.

Corollary 5.

$$A(n, 2\delta, w) \le \left[ \frac{n}{w} \left[ \frac{n-1}{w-1} \left[ \cdots \left[ \frac{n - w + \delta}{\delta} \right] \cdots \right] \right] \right]. \tag{5}$$

Proof. Iterate Theorem 4 and then use Theorem 1(d). Q.E.D.


In practice Theorem 4 is applied repeatedly until a known value of
A(n, d, w) is reached.


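Example.

$$A(20, 8, 7) \le \left[ \frac{20}{7} A(19, 8, 6) \right] \quad \text{from (4)},$$

$$\le \left[ \frac{20}{7} \left[ \frac{19}{6} A(18, 8, 5) \right] \right] \quad \text{from (4)}.$$

Now Theorem 2 applies:

$$A(18, 8, 5) \le \left[ \frac{72}{25 - 90 + 72} \right] = \left[ \frac{72}{7} \right] = 10.$$

But A(18, 8, 5) = 10 is impossible from Theorem 3, since
$50 = 18 \cdot 2 + 14$ but
$18 \cdot 2 + 2 \cdot 2 \cdot 14 > (5 - 4) \cdot 10 \cdot 9$. Hence
$A(18, 8, 5) \le 9$, and so

$$A(20, 8, 7) \le \left[ \frac{20}{7} \left[ \frac{19}{6} \cdot 9 \right] \right] = \left[ \frac{20}{7} \cdot 28 \right] = 80.$$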


In fact equality holds in both of these (see Fig. 3 of Appendix A).

Problem. (2) Show

$$A(n, 2\delta, w) \le \left[ \frac{n}{n - w} A(n - 1, 2\delta, w) \right]. \tag{6}$$

[Hint: use Theorem 1(b).]


Sometimes (6) gives better results than (4). 
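
The repeated use of (4) in the example for A(20, 8, 7) can be sketched as a short loop. The function names are hypothetical; `base` is whatever bound is already known for the shortened parameters, which is an input and not something the code derives.

```python
def iterate_theorem4(n, w, steps, base):
    """Apply (4) `steps` times, starting from a bound `base` on
    A(n - steps, 2*delta, w - steps), and return the bound on A(n, 2*delta, w)."""
    bound = base
    for s in range(steps - 1, -1, -1):
        # one application of (4) at length n - s and weight w - s
        bound = ((n - s) * bound) // (w - s)
    return bound


def corollary5(n, delta, w):
    """Bound (5): iterate (4) down to weight delta, then use Theorem 1(d)."""
    return iterate_theorem4(n, w, w - delta, (n - w + delta) // delta)


# A(20, 8, 7): two steps of (4) from the bound A(18, 8, 5) <= 9
bound_20_8_7 = iterate_theorem4(20, 7, 2, 9)
```

Starting from $A(18, 8, 5) \le 9$ this gives 80, whereas starting from the weaker value 10 supplied by Theorem 2 alone would give 88, and Corollary 5 by itself gives 125; stopping the iteration at the best available intermediate bound is what makes the recurrence effective.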


Research Problem (17.1). Given n, $2\delta$, w, in which order should (4), (6)
and (1) be applied to get the smallest bound on $A(n, 2\delta, w)$?


Problems. (3) Some easy values. Show 


A(10, 6, 3) = 3, A(10, 6, 4) = 5, A(10, 6, 5) = 6, 
A(11, 6, 4) = 6, A(11, 6, 5) = 11, 
A(12, 6, 4) = 9. 
(4) Assuming a 4r x 4r Hadamard matrix exists, show that 
A(4r — 2, 2r, 2r - 1) = 2r, (7) 
A(4r—-1,2r,2r—1)=4r-1, (8) 
A(4r, 2r, 2r) = 8r - 2. (9) 


(Hint: use the codes £n, Cn, An of page 49 of Ch. 2.] 
(5) Show

$$A(n + 1, 2\delta, w) \le A(n, 2\delta, w) + A(n, 2\delta, w - 1).$$

(6) Show that

$$A(n, 2\delta, w) \le \frac{n(n - 1) \cdots (n - w + \delta)}{w(w - 1) \cdots \delta} \tag{10}$$

with equality iff a Steiner system $S(w - \delta + 1, w, n)$ exists (cf. §5 of
Ch. 2).
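
The right-hand side of (10) is a simple product, and evaluating it for parameters of well-known Steiner systems gives the number of blocks. The sketch below uses exact rational arithmetic; the function name is hypothetical, and the two systems used as examples (the Fano plane S(2, 3, 7) and the system S(5, 8, 24)) are standard facts not restated here.

```python
from fractions import Fraction


def steiner_bound(n, delta, w):
    """Right-hand side of (10): n(n-1)...(n-w+delta) / (w(w-1)...delta)."""
    value = Fraction(1)
    for s in range(w - delta + 1):
        value *= Fraction(n - s, w - s)
    return value


fano = steiner_bound(7, 2, 3)      # S(2, 3, 7): bound on A(7, 4, 3)
octads = steiner_bound(24, 4, 8)   # S(5, 8, 24): bound on A(24, 8, 8)
```

Here `fano` equals 7 and `octads` equals 759, the numbers of blocks of the two systems, so (10) holds with equality in both cases.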


Thus known facts about Steiner systems (see Chen [277], Collens [300],
Dembowski [370], Denniston [372], Doyen and Rosa [386], Hanani [595-600],
Di Paola et al. [1020], Witt [1423, 1424]) imply corresponding results about
A(n, d, w). For example, the Steiner systems $S(3, q + 1, q^2 + 1)$ exist for
any prime power q [370, Ch. 6], and so

$$A(q^2 + 1, 2q - 2, q + 1) = q(q^2 + 1), \quad q = \text{prime power}. \tag{11}$$

Similarly, from unitals [370, p. 104],

$$A(q^3 + 1, 2q, q + 1) = q^2(q^2 - q + 1), \quad q = \text{prime power}. \tag{12}$$

Problem 6 also shows that the determination of $A(n, 2\delta, w)$ is in
general a very difficult problem. For example $A(111, 20, 11) \le 111$ from
(10), with equality iff there exists a projective plane of order 10. The
following results are quoted without proof.


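Theorem 6. (Schönheim [1158]; see also Fort and Hedlund [446], Schönheim
[1161], Spencer [1259] and Swift [1294].)

$$A(n, 4, 3) = \begin{cases} \left[ \dfrac{n}{3} \left[ \dfrac{n - 1}{2} \right] \right] & \text{for } n \not\equiv 5 \pmod 6, \\[2ex] \left[ \dfrac{n}{3} \left[ \dfrac{n - 1}{2} \right] \right] - 1 & \text{for } n \equiv 5 \pmod 6. \end{cases}$$


Theorem 7. (Kalbfleisch and Stanton [708], Brouwer [201a].)

$$A(n, 4, 4) = \begin{cases} \dfrac{n(n - 1)(n - 2)}{24} & \text{if } n \equiv 2 \text{ or } 4 \pmod 6, \\[2ex] \dfrac{n(n - 1)(n - 3)}{24} & \text{if } n \equiv 1 \text{ or } 3 \pmod 6, \\[2ex] \dfrac{n(n^2 - 3n - 6)}{24} & \text{if } n \equiv 0 \pmod 6. \end{cases}$$


Research Problem (17.2). Find A(n, 4, 4) in the remaining cases of Theorem 7.


Theorem 8. (Hanani [595-598].)

$$A(n, 6, 4) = \frac{n(n - 1)}{12} \quad \text{iff } n \equiv 1 \text{ or } 4 \pmod{12},$$

$$A(n, 8, 5) = \frac{n(n - 1)}{20} \quad \text{iff } n \equiv 1 \text{ or } 5 \pmod{20}.$$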


The proofs of Theorems 6-8 give explicit constructions for codes attaining 
these bounds. 


Theorem 9. (Erdős and Hanani [411], Wilson [1420-1422].)
(a) For each fixed $k \ge 2$, there is an $n_0(k)$ such that for all
$n > n_0(k)$,

$$A(n, 2k - 2, k) = \frac{n(n - 1)}{k(k - 1)} \quad \text{iff } n \equiv 1 \text{ or } k \pmod{k(k - 1)}.$$

Also

$$\lim_{n \to \infty} \frac{k(k - 1)}{n(n - 1)} A(n, 2k - 2, k) = 1.$$

(b) If p is a prime power then

$$\lim_{n \to \infty} \frac{(p + 1)p(p - 1)}{n(n - 1)(n - 2)} A(n, 2p - 2, p + 1) = 1.$$

(See also the papers of Alltop [21-24] and Rokowska [1122].)


But even when none of the above theorems apply, there is still a good
method for getting lower bounds on $A(n, 2\delta, w)$: we find a code of
length n and minimum distance $2\delta$, and use the set of all codewords of
weight w. E.g. using the 2576 codewords of weight 12 in the [24, 12, 8] Golay
code $\mathcal{G}_{24}$ shows $A(24, 8, 12) \ge 2576$. (Actually equality
holds.)

More generally, one can use the vectors of weight w in any coset of a code
of length n and distance $2\delta$. E.g. using the cosets of
$\mathcal{G}_{24}$ given in Problem 13 of Ch. 2 shows $A(24, 8, 6) \ge 77$,
$A(24, 8, 10) \ge 960$, etc.

Constant weight codes have a number of practical applications; see Cover
[310], Freiman [456], Gilbert [483], Hsiao [666], Kautz and Elspas [752],
Maracle and Wolverton [911], Nakamura et al. [983], Neumann [986, 987],
Schwartz [1169] and Tang and Liu [1307]. But little is known in general about:


Research Problem (17.3). Find good methods of encoding and decoding 
constant weight codes (cf. Cover [310]). 


Problem. (7) Equations (4) and (1) imply
$A(12, 6, 5) \le [12A(11, 6, 4)/5] = 14$. Show that A(12, 6, 5) = 12.
[Hint. (i) By trial and error find an array showing that
$A(12, 6, 5) \ge 12$.
(ii) In any array realizing A(12, 6, 5), the number of 1's in a column is
$\le A(11, 6, 4) = 6$.
(iii) The number of pairs of 1's in any two columns is
$\le A(10, 6, 3) = 3$.
(iv) Suppose columns 2, 3 contain 3 11's. Without loss of generality the first
3 rows are


011111000000
011000111000
011000000111


Let b, c, d, e be the number of rows beginning 110, 101, 100, 000 respectively.
Then $b + c + d + e \ge 9$, $b \le 3$, $c \le 3$, $b + c + d \le 6$ hence
$e \ge 3$. But $e \le A(9, 6, 5) = 3$, hence e = 3 and there are $\le 12$ rows.

(v) Suppose no pair of columns contain 3 11's. Then the column sums are
$\le 5$ and again there are $\le 12$ rows. For if column 1 has weight
$\ge 6$, look at the rows with 1 in column 1, and show that some column has
inner product $\ge 3$ with column 1.]


Problem. (8) (Hard.) Show that A(13, 6, 5) = 18. 


Some other papers dealing with A(n, d, w) are Niven [996] and Stanton et 
al. [1270]. 


§3. Bounds on A(n, d)


The bounds on $A(n, d, w)$ obtained in §2 can now be used to bound $A(n, d)$.
First we quote some results established in §§2, 3 of Ch. 2. Theorem 11 gives
$A(n, d)$ exactly if $n \le 2d$.


Theorem 10.
(a) $A(n, 2\delta) = A(n-1, 2\delta-1)$
(b) $A(n, d) \le 2A(n-1, d)$


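Theorem 11. (Plotkin and Levenshtein.) Provided suitable Hadamard matrices
exist:

(a)
$$A(n, 2\delta) = 2\left\lfloor \frac{2\delta}{4\delta - n} \right\rfloor \quad \text{if } 4\delta > n \ge 2\delta,$$
$$A(4\delta, 2\delta) = 8\delta.$$

(b)
$$A(n, 2\delta+1) = 2\left\lfloor \frac{2\delta+2}{4\delta+3-n} \right\rfloor \quad \text{if } 4\delta+3 > n \ge 2\delta+1,$$
$$A(4\delta+3, 2\delta+1) = 8\delta+8.$$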


The following result, proved in Theorem 6 of Ch. 1, is useful if $d$ is small.

Theorem 12. (The sphere-packing or Hamming bound.)

$$A(n, 2\delta+1)\left\{1 + \binom{n}{1} + \cdots + \binom{n}{\delta}\right\} \le 2^n. \tag{13}$$


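This theorem can be strengthened by introducing the function $A(n, d, w)$.

Theorem 13. (Johnson.)

$$A(n, 2\delta+1)\left\{1 + \binom{n}{1} + \cdots + \binom{n}{\delta} + \frac{\binom{n}{\delta+1} - \binom{2\delta+1}{\delta} A(n, 2\delta+2, 2\delta+1)}{\lfloor n/(\delta+1) \rfloor}\right\} \le 2^n. \tag{14}$$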

Proof. (i) Let $\mathscr{C}$ be an $(n, M, d)$ code with $M = A(n, d)$, $d = 2\delta+1$, which
contains the 0 vector. Let $S_i$ be the set of vectors at distance $i$ from $\mathscr{C}$, i.e.

$$S_i = \{u \in F^n : \operatorname{dist}(u, v) \ge i \text{ for all } v \in \mathscr{C}, \text{ and } \operatorname{dist}(u, v) = i \text{ for some } v \in \mathscr{C}\}.$$

Thus $S_0 = \mathscr{C}$. Then

$$S_0 \cup S_1 \cup \cdots \cup S_{d-1} = F^n,$$

for if there were a vector at distance $\ge d$ from $\mathscr{C}$ we could add it to $\mathscr{C}$ and get
a larger code. The sphere-packing bound (Theorem 12) follows because
$S_0, \ldots, S_\delta$ are disjoint. To obtain (14) we estimate $S_{\delta+1}$.

(ii) Pick an arbitrary codeword $P$ and move it to the origin. The codewords
of weight $d = 2\delta+1$ form a constant weight code with distance $\ge 2\delta+2$, i.e.
the number of codewords of weight $d$ is $\le A(n, 2\delta+2, 2\delta+1)$.

(iii) Let $W_{\delta+1}$ be the set of vectors in $F^n$ of weight $\delta+1$. Any vector in $W_{\delta+1}$
belongs to either $S_\delta$ or $S_{\delta+1}$. Corresponding to each codeword $Q$ of weight $d$
there are $\binom{d}{\delta}$ vectors of weight $\delta+1$ at distance $\delta$ from $Q$. These vectors are in
$W_{\delta+1} \cap S_\delta$, and are all distinct. Therefore

$$|W_{\delta+1} \cap S_{\delta+1}| = |W_{\delta+1}| - |W_{\delta+1} \cap S_\delta| \ge \binom{n}{\delta+1} - \binom{d}{\delta} A(n, 2\delta+2, 2\delta+1).$$

(iv) A vector $R$ in $W_{\delta+1} \cap S_{\delta+1}$ is at distance $\delta+1$ from at most $\lfloor n/(\delta+1) \rfloor$
codewords. For move the origin to $R$ and consider how many codewords can
be at distance $\delta+1$ from $R$ and have mutual distance $d$ (really $d+1 = 2\delta+2$).
Such codewords must have disjoint sets of 1's, so their number is $\le \lfloor n/(\delta+1) \rfloor$.

(v) Now let $P$ vary over all the codewords, and count the points in
$S_0 \cup \cdots \cup S_{\delta+1}$; (14) follows. Q.E.D.
Example.

$$A(12, 5) \le \frac{2^{12}}{1 + \binom{12}{1} + \binom{12}{2} + \left\{\binom{12}{3} - 10A(12, 6, 5)\right\}/4}.$$

But $A(12, 6, 5) = 12$ from Problem 7. Therefore $A(12, 5) \le 39$. For comparison
the sphere-packing bound only gives $A(12, 5) \le 51$. We shall see in the next
section that in fact $A(12, 5) = 32$.
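The arithmetic in this example is easy to reproduce. The following is a minimal sketch, not part of the original text: the function names are ours, and `a_const` stands for whatever value of (or upper bound on) $A(n, 2\delta+2, 2\delta+1)$ is available.

```python
from fractions import Fraction
from math import comb


def sphere_packing_bound(n, delta):
    """Upper bound on A(n, 2*delta + 1) from the Hamming bound (13)."""
    volume = sum(comb(n, i) for i in range(delta + 1))
    return 2**n // volume


def johnson_bound(n, delta, a_const):
    """Upper bound on A(n, 2*delta + 1) from the Johnson bound (14).

    a_const is a value of, or an upper bound on, A(n, 2*delta+2, 2*delta+1).
    """
    volume = sum(comb(n, i) for i in range(delta + 1))
    # Extra term of (14): estimate of the vectors at distance delta + 1.
    extra = Fraction(
        comb(n, delta + 1) - comb(2 * delta + 1, delta) * a_const,
        n // (delta + 1),
    )
    # Floor division of two Fractions returns an int.
    return Fraction(2**n) // (volume + extra)


print(sphere_packing_bound(12, 2), johnson_bound(12, 2, 12))
```

With $n = 12$, $\delta = 2$ and $A(12, 6, 5) = 12$ the denominator is $79 + 100/4 = 104$, which gives the two bounds quoted in the example.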


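Corollary 14.

$$A(n, 2\delta+1)\left\{1 + \binom{n}{1} + \cdots + \binom{n}{\delta} + \frac{\binom{n}{\delta}}{\lfloor n/(\delta+1) \rfloor}\left(\frac{n-\delta}{\delta+1} - \left\lfloor \frac{n-\delta}{\delta+1} \right\rfloor\right)\right\} \le 2^n. \tag{15}$$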





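Proof. From Corollary 5,

$$A(n, 2\delta+2, 2\delta+1) \le \frac{n(n-1) \cdots (n-\delta+1)}{(2\delta+1)\,2\delta \cdots (\delta+2)} \left\lfloor \frac{n-\delta}{\delta+1} \right\rfloor.$$

Substituting this in (14) gives (15). Q.E.D.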
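For readers who want the substitution spelled out, the algebra can be sketched as follows. The intermediate grouping is ours and is only one way to organize the computation.

```latex
% Multiply the bound from Corollary 5 by \binom{2\delta+1}{\delta} = (2\delta+1)!/(\delta!\,(\delta+1)!):
\binom{2\delta+1}{\delta} A(n, 2\delta+2, 2\delta+1)
  \le \frac{n(n-1)\cdots(n-\delta+1)}{\delta!}
      \left\lfloor \frac{n-\delta}{\delta+1} \right\rfloor
  = \binom{n}{\delta} \left\lfloor \frac{n-\delta}{\delta+1} \right\rfloor .
% Since \binom{n}{\delta+1} = \binom{n}{\delta}\,\frac{n-\delta}{\delta+1}, the numerator in (14) satisfies
\binom{n}{\delta+1} - \binom{2\delta+1}{\delta} A(n, 2\delta+2, 2\delta+1)
  \ge \binom{n}{\delta}
      \left( \frac{n-\delta}{\delta+1}
             - \left\lfloor \frac{n-\delta}{\delta+1} \right\rfloor \right),
% and dividing by \lfloor n/(\delta+1) \rfloor gives the last term inside the braces of (15).
```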


If equality holds in (13) the code is called perfect (see Ch. 6). A code for
which equality holds in (15) is called nearly perfect (Goethals & Snover [504]).


Problems. (9) (a) Show that a perfect code is nearly perfect. 

(b) Show that a nearly perfect code is quasi-perfect. 

(10) Show that the $[2^r - 2, 2^r - r - 2, 3]$ shortened Hamming code, and the
punctured Preparata code $\mathscr{P}(m)^*$ (p. 471 of Ch. 15) are nearly perfect. Hence
$A(2^r - 2, 3) = 2^{2^r - r - 2}$ and $A(2^m - 1, 5) = 2^{2^m - 2m}$ for even $m \ge 4$. This shows that
the Preparata code has the greatest possible number of codewords for this
length and distance.


It has recently been shown by K. Lindström [844] that there are no other
nearly perfect binary codes. The proof is very similar to that of Theorem 33
of Ch. 6.


Problems. (11) For even $n$ show that $A(n, 3) \le 2^n/(n+2)$. [Hint: Combine
Theorems 6 and 13. For $n \equiv 4 \pmod 6$ one can say a bit more.]
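(12) (Johnson.) Prove the following refinement of (14):

$$A(n, 2\delta+1)\left\{1 + \binom{n}{1} + \cdots + \binom{n}{\delta} + \frac{C_{\delta+1}}{\lfloor n/(\delta+1) \rfloor} + \frac{C_{\delta+2}}{A(n, 2\delta+2, \delta+2)}\right\} \le 2^n, \tag{16}$$

where

$$C_{\delta+1} = \binom{n}{\delta+1} - \binom{2\delta+1}{\delta} A(n, 2\delta+2, 2\delta+1),$$

$$C_{\delta+2} = \binom{n}{\delta+2} - \binom{2\delta+1}{\delta-1} A(n, 2\delta+2, 2\delta+1) - \binom{2\delta+1}{\delta+1}(n - 2\delta - 1) A(n, 2\delta+2, 2\delta+1) - \binom{2\delta+2}{\delta} A(n, 2\delta+2, 2\delta+2) - \binom{2\delta+3}{\delta+1} A(n, 2\delta+2, 2\delta+3).$$

[Hint: Any vector at distance $\delta+2$ from a codeword belongs to one of $S_{\delta-1}$,
$S_\delta$, $S_{\delta+1}$ or $S_{\delta+2}$.]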

(13) Generalize (16). 

(14) Show that if $n \le 2d$ then $A(n, d)$ is either 1 or an even number,
provided suitable Hadamard matrices exist. [Hint: Theorem 11.] Use Fig. 1 of
Appendix A to show that if $A(n, d)$ is odd and greater than 1 then $A(n, d) \ge 37$.


Research Problem (17.4). Is it true that if A(n, d) > 1 then A(n, d) is even, for 
all n and d? (Elspas [409].) 

Just as the bound for $A(n, 2\delta+1)$ given in (14) depends on $A(n, 2\delta+2, 2\delta+1)$,
so it is possible to give bounds for $A(n, d, w)$ which depend on a function
denoted by $T(w_1, n_1, w_2, n_2, d)$. These bounds are rather complicated and we
do not give them here (see [697]). But the function $T$ is of independent
interest.


Definition. $T(w_1, n_1, w_2, n_2, d)$ is the maximum number of binary vectors of
length $n_1 + n_2$, having Hamming distance at least $d$ apart, where each vector
has exactly $w_1$ 1's in the first $n_1$ coordinates and exactly $w_2$ 1's in the last $n_2$
coordinates.
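For very small parameters the definition can be evaluated by exhaustive search. The sketch below is ours (the function names are not from the text) and is only practical when the number of admissible vectors, $\binom{n_1}{w_1}\binom{n_2}{w_2}$, is tiny.

```python
from itertools import combinations


def admissible_vectors(w1, n1, w2, n2):
    """All 0/1 tuples with w1 ones in the first n1 places and w2 in the last n2."""
    out = []
    for left in combinations(range(n1), w1):
        for right in combinations(range(n1, n1 + n2), w2):
            ones = set(left) | set(right)
            out.append(tuple(1 if i in ones else 0 for i in range(n1 + n2)))
    return out


def dist(u, v):
    """Hamming distance between two tuples of equal length."""
    return sum(a != b for a, b in zip(u, v))


def T(w1, n1, w2, n2, d):
    """Largest set of admissible vectors with pairwise distance >= d (brute force)."""
    best = 0

    def extend(size, candidates):
        nonlocal best
        best = max(best, size)
        for idx, v in enumerate(candidates):
            # Keep only later candidates that are still compatible with v.
            rest = [u for u in candidates[idx + 1:] if dist(u, v) >= d]
            if size + 1 + len(rest) > best:  # prune branches that cannot win
                extend(size + 1, rest)

    extend(0, admissible_vectors(w1, n1, w2, n2))
    return best


print(T(1, 3, 2, 4, 6))
```

For instance `T(1, 3, 2, 4, 6)` searches 18 vectors of length 7; two vectors at distance 6 must have disjoint supports, so at most two fit in the last four coordinates.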


Example. $T(1, 3, 2, 4, 6) = 2$, as illustrated by the vectors

100 1100
010 0011

(the 1's must be disjoint!).

Problem. (15) Prove:
(a) $T(w_1, n_1, w_2, n_2, 2\delta) = T(n_1 - w_1, n_1, w_2, n_2, 2\delta)$.
(b) If $2w_1 + 2w_2 = 2\delta$ then

$$T(w_1, n_1, w_2, n_2, 2\delta) = \min\left\{\left\lfloor \frac{n_1}{w_1} \right\rfloor, \left\lfloor \frac{n_2}{w_2} \right\rfloor\right\}.$$

(c) $$T(w_1, n_1, w_2, n_2, 2\delta) \le \left\lfloor \frac{n_1}{w_1}\, T(w_1 - 1, n_1 - 1, w_2, n_2, 2\delta) \right\rfloor.$$

(d) $$T(w_1, n_1, w_2, n_2, 2\delta) \le \left\lfloor \frac{n_1}{n_1 - w_1}\, T(w_1, n_1 - 1, w_2, n_2, 2\delta) \right\rfloor.$$

(e) $$T(w_1, n_1, w_2, n_2, 2\delta) \le \left\lfloor \frac{\delta}{\dfrac{w_1^2}{n_1} + \dfrac{w_2^2}{n_2} + \delta - w_1 - w_2} \right\rfloor,$$

provided the denominator is positive.
(f) $T(0, n_1, w_2, n_2, 2\delta) = A(n_2, 2\delta, w_2)$.
(g) $T(w_1, n_1, w_2, n_2, 2\delta) \le A(n_2, 2\delta - 2w_1, w_2)$.
(h) (A generalization of Theorem 3.) Suppose $T(w_1, n_1, w_2, n_2, 2\delta) = M$, and
for $i = 1, 2$ write $Mw_i = q_i n_i + r_i$, $0 \le r_i < n_i$. Then

$$M^2(w_1 + w_2) \ge r_1(q_1 + 1)^2 + (n_1 - r_1)q_1^2 + r_2(q_2 + 1)^2 + (n_2 - r_2)q_2^2 + \delta M(M - 1).$$

(i)
$T(1, 9, 4, 9, 8) = 3$, $T(1, 9, 4, 9, 10) = 2$,
$T(1, 9, 5, 13, 10) = 3$, $T(1, 9, 6, 14, 10) = 7$,
$T(1, 8, 7, 15, 10) \le 11$, $T(1, 9, 7, 15, 10) = 12$.


Tables of $T(w_1, n_1, w_2, n_2, d)$ are given by Best et al. [140].


§4. Linear programming bounds


It is sometimes possible to use linear programming to obtain excellent 
bounds on the size of a code with a given distance distribution. This section 
begins with a brief general treatment of linear programming. Then the 
applications are given, first to arbitrary binary codes, and then to constant 
weight codes. Asymptotic results, valid when n is large, are postponed to §7.


Linear programming (see for example Simonnard [1211] or Ficken [428]).
This is a technique for maximizing (or minimizing) a linear form, called the
objective function, subject to certain linear constraints. A typical problem is
the following:

Problem. (I) The primal linear programming problem. Choose the real
variables $x_1, \ldots, x_s$ so as to maximize the objective function

$$\sum_{j=1}^{s} c_j x_j \tag{17}$$

subject to the inequalities

$$x_j \ge 0, \quad j = 1, \ldots, s, \tag{18}$$

$$\sum_{j=1}^{s} a_{ij} x_j \ge -b_i, \quad i = 1, \ldots, n. \tag{19}$$


Associated with this is the dual problem, which has as many variables as 
the primal problem has constraints, and as many constraints as there are 
variables in the primal problem. 


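Problem. (II) The dual linear programming problem. Choose the real variables
$u_1, \ldots, u_n$ so as to

$$\text{minimize } \sum_{i=1}^{n} u_i b_i \tag{20}$$

subject to the inequalities

$$u_i \ge 0, \quad i = 1, \ldots, n, \tag{21}$$

$$\sum_{i=1}^{n} u_i a_{ij} \le -c_j, \quad j = 1, \ldots, s. \tag{22}$$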


It is convenient to restate these in matrix notation. 


(I)' Maximize $cx^T$ subject to $x \ge 0$, $Ax^T \ge -b^T$.
(II)' Minimize $ub^T$ subject to $u \ge 0$, $uA \le -c$.

A vector $x$ is called a feasible solution to (I) or (I)' if it satisfies the
inequalities, and an optimal solution if it also maximizes $cx^T$. Similarly for
(II). The next three theorems give the basic facts that we shall need.


Theorem 15. If $x$ and $u$ are feasible solutions to (I) and (II) respectively then

$$cx^T \le ub^T.$$

Proof.

$$\sum_{j=1}^{s} a_{ij} x_j \ge -b_i \quad \text{and} \quad u_i \ge 0$$

together imply

$$u_i \sum_{j=1}^{s} a_{ij} x_j \ge -u_i b_i \quad \text{or} \quad uAx^T \ge -ub^T.$$

Similarly

$$\sum_{i=1}^{n} u_i a_{ij} \le -c_j \quad \text{and} \quad x_j \ge 0$$

imply $uAx^T \le -cx^T$. Then $cx^T \le -uAx^T \le ub^T$. Q.E.D.


Theorem 16. (The duality theorem.) Let $x$ and $u$ be feasible solutions to (I) and
(II) respectively. Then $x$ and $u$ are both optimal iff $cx^T = ub^T$.

Proof. ($\Leftarrow$) Suppose $cx^T = ub^T$ but $x$ is not optimal. Then there is a feasible
solution $y$ to (I) such that $cy^T > cx^T = ub^T$, contradicting the previous
theorem. The second half of the proof is omitted (see [1211]). Q.E.D.


Theorem 17. (The theorem of complementary slackness.) Let $x$ and $u$ be
feasible solutions to (I) and (II) respectively. Then $x$ and $u$ are both optimal iff
for every $j = 1, \ldots, s$, either

$$x_j = 0 \quad \text{or} \quad \sum_{i=1}^{n} u_i a_{ij} = -c_j;$$

and for every $i = 1, \ldots, n$, either

$$u_i = 0 \quad \text{or} \quad \sum_{j=1}^{s} a_{ij} x_j = -b_i.$$

In words, if a primal constraint is not met with equality, the corresponding
dual variable must be zero, and vice versa.

The proof may be left to the reader.


Applications to codes (Delsarte [350-352]). Let $\mathscr{C}$ be an $(n, M, d)$ binary code
in which the distances between codewords are $\tau_0 = 0 < \tau_1 < \tau_2 < \cdots < \tau_s$. Let
$\{B_i\}$ be the distance distribution of $\mathscr{C}$, i.e. $B_i$ is the average number of
codewords at distance $i$ from a fixed codeword (§1 of Ch. 2). Thus $B_0 = 1$,
$B_{\tau_j} \ge 0$ $(j = 1, \ldots, s)$ and $B_i = 0$ otherwise. Also $M = 1 + \sum_{j=1}^{s} B_{\tau_j}$. The
transformed distribution $\{B'_k\}$ is given by (Theorem 5 of Ch. 5)

$$B'_k = \frac{1}{M} \sum_{i=0}^{n} B_i P_k(i), \quad k = 0, \ldots, n, \tag{23}$$

where $P_k(x)$ is a Krawtchouk polynomial (§2 of Ch. 5). Also $B'_0 = 1$ and
$\sum_{k=0}^{n} B'_k = 2^n/M$. We now make good use of Theorem 6 of Ch. 5 (or Theorem
12 of Ch. 21), which states that

$$B'_k \ge 0 \quad \text{for } k = 0, \ldots, n. \tag{24}$$

Thus if $\mathscr{C}$ is any code with distances $\{\tau_j\}$, $j = 0, \ldots, s$, between codewords,
then $B_{\tau_1}, \ldots, B_{\tau_s}$ is a feasible solution to the following linear programming
problem.

Problem. (III) Choose $B_{\tau_1}, \ldots, B_{\tau_s}$ so as to

$$\text{maximize } \sum_{j=1}^{s} B_{\tau_j}$$

subject to

$$B_{\tau_j} \ge 0, \quad j = 1, \ldots, s, \tag{25}$$

$$\sum_{j=1}^{s} B_{\tau_j} P_k(\tau_j) \ge -\binom{n}{k}, \quad k = 1, \ldots, n. \tag{26}$$

Therefore we have the following theorem. 


Theorem 18. (The first version of the linear programming bound for codes.) If
$B^*_{\tau_1}, \ldots, B^*_{\tau_s}$ is an optimal solution to (III), then $1 + \sum_{j=1}^{s} B^*_{\tau_j}$ is an upper
bound on the size of $\mathscr{C}$.

Note that (III) certainly has a feasible solution: $B_{\tau_i} = 0$ for all $i$.
It is often possible to invent other constraints that the $B_{\tau_i}$'s must satisfy,
besides (25) and (26). We illustrate by an example.


Example. The Nadler code is optimal. Let us find the largest possible double-
error-correcting code of length 12. Adding an overall parity check to such a
code gives a code $\mathscr{C}$ of length 13, minimum distance 6, in which the only
nonzero distances are 6, 8, 10 and 12; suppose $\mathscr{C}$ has $M$ codewords. Let $A_i(u)$
be the number of codewords in $\mathscr{C}$ at distance $i$ from the codeword $u \in \mathscr{C}$. The
distance distribution of $\mathscr{C}$ is

$$B_i = \frac{1}{M} \sum_{u \in \mathscr{C}} A_i(u), \quad i = 0, \ldots, 13.$$

Then $B_0 = 1$, and the $B_i$'s are zero except (possibly) for $B_6$, $B_8$, $B_{10}$ and $B_{12}$.
The inequalities (26) become

$$\begin{aligned}
B_6 - 3B_8 - 7B_{10} - 11B_{12} &\ge -13, \\
-6B_6 - 2B_8 + 18B_{10} + 54B_{12} &\ge -\tbinom{13}{2}, \\
-6B_6 + 14B_8 - 14B_{10} - 154B_{12} &\ge -\tbinom{13}{3}, \\
15B_6 - 5B_8 - 25B_{10} + 275B_{12} &\ge -\tbinom{13}{4}, \\
15B_6 - 25B_8 + 63B_{10} - 297B_{12} &\ge -\tbinom{13}{5}, \\
-20B_6 + 20B_8 - 36B_{10} + 132B_{12} &\ge -\tbinom{13}{6}.
\end{aligned} \tag{27}$$

(There are only 6 distinct inequalities: see Problem 43 of Ch. 5.)
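The coefficients in (27) are values of Krawtchouk polynomials and can be regenerated mechanically. A small sketch, assuming the standard explicit formula $P_k(x) = \sum_j (-1)^j \binom{x}{j}\binom{n-x}{k-j}$; the helper name `krawtchouk` is ours.

```python
from math import comb


def krawtchouk(n, k, x):
    """Binary Krawtchouk polynomial P_k(x) for length n at an integer 0 <= x <= n."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))


n = 13
distances = (6, 8, 10, 12)

# Row k holds the coefficients of B_6, B_8, B_10, B_12 in the k-th inequality.
rows = [[krawtchouk(n, k, t) for t in distances] for k in range(1, 7)]
rhs = [-comb(n, k) for k in range(1, 7)]

for coeffs, bound in zip(rows, rhs):
    print(coeffs, ">=", bound)
```

The first row reproduces the coefficients $1, -3, -7, -11$ with right-hand side $-13$, and the second row has right-hand side $-\binom{13}{2} = -78$, as used in (29).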
Another inequality may be found. Clearly $A_{12}(u) \le 1$ (for taking $u = 0$ we
see that the number of $v \in \mathscr{C}$ of weight 12 is at most 1). Also

$$A_{10}(u) \le A(13, 6, 10) = A(13, 6, 3) \ \text{(by Theorem 1)} = 4 \ \text{(from Fig. 3 of Appendix A)}.$$

(It is easy to prove this directly.) Finally, if $A_{12}(u) = 1$ then $A_{10}(u) = 0$. So
certainly

$$A_{10}(u) + 4A_{12}(u) \le 4 \quad \text{for all } u \in \mathscr{C}.$$

Summing on $u \in \mathscr{C}$ gives the new inequality

$$B_{10} + 4B_{12} \le 4. \tag{28}$$


Actually (28) and the first two constraints of (27) turn out to be enough, and
so we consider the problem:

Maximize $B_6 + B_8 + B_{10} + B_{12}$

subject to

$$B_6 \ge 0, \quad B_8 \ge 0, \quad B_{10} \ge 0, \quad B_{12} \ge 0$$

and

$$\begin{aligned}
B_6 - 3B_8 - 7B_{10} - 11B_{12} &\ge -13, \\
-6B_6 - 2B_8 + 18B_{10} + 54B_{12} &\ge -78, \\
-B_{10} - 4B_{12} &\ge -4.
\end{aligned} \tag{29}$$

The dual problem is:

Minimize $13u_1 + 78u_2 + 4u_3$

subject to

$$u_1 \ge 0, \quad u_2 \ge 0, \quad u_3 \ge 0$$

and

$$\begin{aligned}
u_1 - 6u_2 &\le -1, \\
-3u_1 - 2u_2 &\le -1, \\
-7u_1 + 18u_2 - u_3 &\le -1, \\
-11u_1 + 54u_2 - 4u_3 &\le -1.
\end{aligned} \tag{30}$$

Feasible solutions to these two problems are

$$B_6 = 24, \quad B_8 = 3, \quad B_{10} = 4, \quad B_{12} = 0, \tag{31}$$

$$u_1 = u_2 = \tfrac{1}{5}, \quad u_3 = \tfrac{16}{5}. \tag{32}$$

In fact since the corresponding objective functions are equal:

$$24 + 3 + 4 + 0 = 13 \cdot \tfrac{1}{5} + 78 \cdot \tfrac{1}{5} + 4 \cdot \tfrac{16}{5} = 31,$$

it follows from Theorem 16 that (31) and (32) are optimal solutions. (These
solutions are easily obtained by hand using the simplex method: see [428] or
[1211].)
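The optimality certificate can be checked with exact rational arithmetic. This sketch is ours and does not solve the linear program; it only verifies that the candidate primal point $(24, 3, 4, 0)$ and the dual point $(\tfrac15, \tfrac15, \tfrac{16}{5})$ are feasible and have equal objective values, which is what Theorem 16 requires.

```python
from fractions import Fraction as F

# Primal problem in the form of (I)': maximize c.x subject to x >= 0, A x >= -b.
A = [
    [1, -3, -7, -11],
    [-6, -2, 18, 54],
    [0, 0, -1, -4],
]
b = [13, 78, 4]
c = [1, 1, 1, 1]

x = [24, 3, 4, 0]                  # candidate primal solution (B6, B8, B10, B12)
u = [F(1, 5), F(1, 5), F(16, 5)]   # candidate dual solution (u1, u2, u3)

primal_lhs = [sum(a * v for a, v in zip(row, x)) for row in A]
dual_lhs = [sum(u[i] * A[i][j] for i in range(3)) for j in range(4)]

primal_ok = all(v >= 0 for v in x) and all(l >= -bi for l, bi in zip(primal_lhs, b))
dual_ok = all(v >= 0 for v in u) and all(l <= -cj for l, cj in zip(dual_lhs, c))

primal_value = sum(ci * xi for ci, xi in zip(c, x))
dual_value = sum(ui * bi for ui, bi in zip(u, b))

print(primal_ok, dual_ok, primal_value, dual_value)
```

The list `dual_lhs` also shows which dual constraints are tight (the first three), which is the information the complementary slackness argument uses.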

The following argument shows that (31) is the unique optimal solution. Let
$x_6$, $x_8$, $x_{10}$, $x_{12}$ be any optimal solution to the primal problem. The $u_i$ of (32) are
all positive and satisfy the first three constraints of (30) with equality but not
the fourth. Hence from Theorem 17, the $x_i$ must satisfy the primal constraints
(29) with equality, and $x_{12} = 0$. These equations have the unique solution

$$x_6 = 24, \quad x_8 = 3, \quad x_{10} = 4.$$

Thus (31) is the unique optimal solution.

The other constraints in (27) are also satisfied by (31), hence from Theorem
18, $\mathscr{C}$ has at most 32 codewords. So we have proved that any double-error-
correcting code of length 12 has at most 32 codewords. This bound is attained
by the Nadler code of Ch. 2, §8.

Note that this method has told us the distance distribution of $\mathscr{C}$. Also, since
the first two of the constraints (27) are met with equality, in the transformed
distribution $B'_1 = B'_2 = 0$, and $d' = 3$. But there are only 3 nonzero distances in
$\mathscr{C}$. Hence by Theorem 6 of Ch. 6, $\mathscr{C}$ is distance invariant, and $A_i(u) = B_i$ for
all $u \in \mathscr{C}$. (These remarks apply only to the extended code $\mathscr{C}$, not to the
original code of length 12.)

If the bound obtained by the linear programming method is an odd number,
say $|\mathscr{C}| \le b$, $b$ odd, then the bound can sometimes be reduced by the following
argument (see Best et al. [140]). For suppose $|\mathscr{C}| = b$; then from (36) of Ch. 5,

$$bB'_k = \sum_{i=0}^{n} B_i P_k(i) = \frac{1}{b} \sum_{\operatorname{wt}(u) = k} \Big( \sum_{v \in \mathscr{C}} (-1)^{u \cdot v} \Big)^2.$$

The inner sum contains an odd number of $\pm 1$'s, hence is nonzero. Therefore
$bB'_k \ge (1/b)\binom{n}{k}$, i.e. we can replace (26) by the stronger condition

$$\sum_{j=1}^{s} B_{\tau_j} P_k(\tau_j) \ge -\left(1 - \frac{1}{b}\right)\binom{n}{k}. \tag{33}$$


The following example shows how this is used. 


Example. $A(8, 3) = 20$. Let $\mathscr{C}$ be a largest single-error-correcting code of
length 8, and let $\hat{\mathscr{C}}$ be the extended code of length 9 and distance 4 with
distance distribution $B_0 = 1$, $B_4$, $B_6$, $B_8$. The conditions (26) are:

$$\begin{aligned}
B_4 - 3B_6 - 7B_8 &\ge -9, \\
-4B_4 + 20B_8 &\ge -36, \\
-4B_4 + 8B_6 - 28B_8 &\ge -84, \\
6B_4 - 6B_6 + 14B_8 &\ge -126,
\end{aligned} \tag{34}$$

and we can also impose the conditions

$$B_8 \le 1, \qquad B_6 + 4B_8 \le 12, \tag{35}$$

which are proved in the same way as (28). By linear programming we find that
the largest value of $B_4 + B_6 + B_8$ subject to (34) and (35) is $20\tfrac{1}{3}$, so $|\mathscr{C}| \le 21$.
Suppose $|\mathscr{C}| = 21$. Then (33) applies, and the RHS's of (34) can be multiplied
by 20/21. Linear programming now gives $B_4 + B_6 + B_8 \le 19.619\ldots$, so $|\mathscr{C}| \le 20$.
Since an (8, 20, 3) code was given in Ch. 2, it follows that $A(8, 3) = 20$. Many
other bounds in Fig. 1 of Appendix A were obtained in this way.
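Both linear programs in this example are small enough to solve exactly by enumerating vertices. The sketch below is ours and is not the simplex method: it intersects every triple of constraint planes, keeps the feasible points, and returns the best objective value. It relies on the feasible region being bounded, which holds here because of (35) and the second inequality of (34).

```python
from fractions import Fraction
from itertools import combinations


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(rows, rhs):
    """Solve a 3x3 linear system by Cramer's rule; return None if singular."""
    d = det3(rows)
    if d == 0:
        return None
    sol = []
    for col in range(3):
        m = [list(r) for r in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        sol.append(Fraction(det3(m)) / d)
    return sol


def max_b4_b6_b8(scale):
    """Maximize B4 + B6 + B8 subject to (34), with right-hand sides times `scale`, and (35)."""
    s = Fraction(scale)
    # Each entry (coeffs, bound) encodes coeffs . (B4, B6, B8) <= bound.
    cons = [
        ((-1, 3, 7), 9 * s),
        ((4, 0, -20), 36 * s),
        ((4, -8, 28), 84 * s),
        ((-6, 6, -14), 126 * s),
        ((0, 0, 1), Fraction(1)),
        ((0, 1, 4), Fraction(12)),
        ((-1, 0, 0), Fraction(0)),
        ((0, -1, 0), Fraction(0)),
        ((0, 0, -1), Fraction(0)),
    ]
    best = None
    for trio in combinations(cons, 3):
        x = solve3([c for c, _ in trio], [b for _, b in trio])
        if x is None:
            continue
        if all(sum(a * v for a, v in zip(c, x)) <= b for c, b in cons):
            if best is None or sum(x) > best:
                best = sum(x)
    return best


print(max_b4_b6_b8(1), max_b4_b6_b8(Fraction(20, 21)))
```

The two values are $61/3 = 20\tfrac13$ and $412/21 = 19.619\ldots$, matching the figures quoted in the example.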

We now return to the general case and consider the dual problem to (III).
This will enable us to obtain analytical bounds. The dual to (III) is:

Problem. (IV) Choose $\beta_1, \ldots, \beta_n$ so as to

$$\text{minimize } \sum_{k=1}^{n} \beta_k \binom{n}{k} \tag{36}$$

subject to

$$\beta_k \ge 0, \quad k = 1, \ldots, n, \tag{37}$$

$$\sum_{k=1}^{n} \beta_k P_k(\tau_j) \le -1, \quad j = 1, \ldots, s. \tag{38}$$

The advantage of using the dual formulation is that any feasible solution to
(IV) gives, by Theorem 15, an upper bound on the size of the code, whereas only
the optimal solution to (III) gives an upper bound. The easiest way to specify
a feasible solution to (IV) is in terms of a polynomial we shall call $\beta(x)$.


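Lemma 19. Let

$$\beta(x) = 1 + \sum_{k=1}^{n} \beta_k P_k(x).$$

Then $\beta_1, \ldots, \beta_n$ is a feasible solution to (IV) iff $\beta_k \ge 0$ for $k = 1, \ldots, n$, and
$\beta(\tau_j) \le 0$ for $j = 1, \ldots, s$.

Proof.

$$\beta(\tau_j) = 1 + \sum_{k=1}^{n} \beta_k P_k(\tau_j). \quad \text{Q.E.D.}$$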


From the preceding remark we have immediately: 


Theorem 20. (The second version of the linear programming bound for codes.)
(Delsarte [350].) Suppose a polynomial $\beta(x)$ of degree at most $n$ can be found
with the following properties. If the Krawtchouk expansion (p. 168 of Ch. 6) of
$\beta(x)$ is

$$\beta(x) = \sum_{k=0}^{n} \beta_k P_k(x), \tag{39}$$

then $\beta(x)$ should satisfy

$$\beta_0 = 1, \tag{40}$$
$$\beta_k \ge 0 \quad \text{for } k = 1, \ldots, n, \tag{41}$$
$$\beta(\tau_j) \le 0 \quad \text{for } j = 1, \ldots, s. \tag{42}$$

Then if $\mathscr{C}$ is any code of length $n$ and distances $\{\tau_j\}$, $j = 1, \ldots, s$, between
codewords,

$$|\mathscr{C}| \le \beta(0) = 1 + \sum_{k=1}^{n} \beta_k \binom{n}{k}. \tag{43}$$


Corollary 21. If

$$\beta(x) = \sum_{k=0}^{n} \beta_k P_k(x)$$

satisfies (i) $\beta_0 = 1$, $\beta_k \ge 0$ for $k = 1, \ldots, n$, and (ii) $\beta(j) \le 0$ for $j = d, d+1, \ldots, n$,
then

$$A(n, d) \le \beta(0). \tag{44}$$

From Theorem 17, any code which meets (43) with equality must satisfy
certain conditions:

Theorem 22. Let $B_{\tau_1}, \ldots, B_{\tau_s}$ and $\beta_1, \ldots, \beta_n$ be feasible solutions to (III) and
(IV) respectively. They are optimal solutions iff

(i) $\beta_k B'_k = 0$ for $1 \le k \le n$,
and

(ii) $\beta(\tau_j) B_{\tau_j} = 0$ for $j = 1, \ldots, s$.


Examples. (1) We begin with a very simple example, and show that the dual of
a Hamming code is optimal. That is, we use Corollary 21 to show that a code
$\mathscr{C}$ of length $2^m - 1$ and minimum distance $2^{m-1}$ can have at most $2^m$ code-
words. (Of course this also follows from Theorem 11a.) We choose

$$\beta(x) = 2(2^{m-1} - x) = P_0(x) + P_1(x),$$

thus $\beta_0 = \beta_1 = 1$, $\beta_k = 0$ for $k > 1$. Since

$$\beta(j) \le 0 \quad \text{for } j = 2^{m-1}, 2^{m-1} + 1, \ldots, 2^m - 1,$$

the hypotheses of Corollary 21 are satisfied, and so from (44)

$$|\mathscr{C}| \le \beta(0) = 2^m.$$

Thus the dual of a Hamming code is optimal.

Furthermore, from Theorem 22(ii), $B_i = 0$ in $\mathscr{C}$ for $i > 2^{m-1}$; thus $\mathscr{C}$ has just
one non-zero distance, $2^{m-1}$.

(2) The Hadamard code $\mathscr{C}_{4m}$ of Ch. 2 with parameters $(4m, 8m, 2m)$ is
optimal: $A(4m, 2m) = 8m$. (Another special case of Theorem 11a.) To show
this, use Corollary 21 with

$$\beta(x) = \frac{1}{m}(2m - x)(4m - x) = P_0(x) + P_1(x) + \frac{1}{2m} P_2(x).$$

(3) Let us find a bound on the size of a code with minimum distance $d$,
using a $\beta(x)$ which is linear, say $\beta(x) = 1 + \beta_1 P_1(x) = 1 + \beta_1(n - 2x)$, where
$\beta_1 > 0$. We want $\beta(d), \beta(d+1), \ldots, \beta(n) \le 0$ and $\beta(0) = 1 + \beta_1 n$ as small as
possible. The best choice is to set $\beta(d) = 0$, i.e. $\beta_1 = 1/(2d - n)$. Then (44)
gives $|\mathscr{C}| \le \beta(0) = 2d/(2d - n)$. This is a weaker version of the Plotkin bound,
Theorem 11.
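The three examples above can be checked numerically. The following sketch is ours: `lp_polynomial_bound` takes the Krawtchouk coefficients $\beta_0, \beta_1, \ldots$ (trailing zeros omitted), tests the hypotheses of Corollary 21, and returns $\beta(0)$ when they hold. The concrete parameter choices ($m = 3$, and $n = 10$, $d = 6$) are illustrative only.

```python
from fractions import Fraction
from math import comb


def krawtchouk(n, k, x):
    """Binary Krawtchouk polynomial P_k(x) for length n at an integer 0 <= x <= n."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))


def lp_polynomial_bound(n, d, beta):
    """Return beta(0) if beta satisfies the hypotheses of Corollary 21, else None."""
    if beta[0] != 1 or any(b < 0 for b in beta[1:]):
        return None

    def value(x):
        return sum(b * krawtchouk(n, k, x) for k, b in enumerate(beta))

    if any(value(j) > 0 for j in range(d, n + 1)):
        return None
    return value(0)


# Example 1 with m = 3: length 7, distance 4, beta = P_0 + P_1.
print(lp_polynomial_bound(7, 4, [1, 1]))
# Example 2 with m = 3: length 12, distance 6, beta = P_0 + P_1 + P_2 / (2m).
print(lp_polynomial_bound(12, 6, [1, 1, Fraction(1, 6)]))
# Example 3 with n = 10, d = 6: beta_1 = 1 / (2d - n).
print(lp_polynomial_bound(10, 6, [1, Fraction(1, 2)]))
```

A return value of `None` means only that the supplied polynomial is not admissible; it says nothing about the code parameters themselves.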


Problem. (16) (Delsarte.) Obtain the sphere packing bound from Corollary 21.
[Hint: Let $d = 2e + 1$. Take

$$\beta_k = \left\{ L_e(k) \Big/ \sum_{i=0}^{e} \binom{n}{i} \right\}^2, \quad 0 \le k \le n,$$

where $L_e(x)$ is the Lloyd polynomial of Ch. 6, Theorem 32. Use Corollary 18 of Ch. 5
to show

$$\beta(i) = 0 \quad \text{for } 2e + 1 \le i \le n, \qquad \beta(0) = 2^n \Big/ \sum_{i=0}^{e} \binom{n}{i}.]$$


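Example. (4) The Singleton bound (Delsarte). One way of satisfying the
hypotheses of Corollary 21 is to make $\beta(j) = 0$ for $j = d, \ldots, n$. Thus we take

$$\beta(x) = 2^{n-d+1} \prod_{j=d}^{n} \left(1 - \frac{x}{j}\right).$$

The coefficients $\beta_k$ are given by (Theorem 20 of Ch. 5)

$$\beta_k = \frac{1}{2^n} \sum_{i=0}^{n} \beta(i) P_i(k) = \frac{1}{2^{d-1}} \sum_{i=0}^{d-1} \binom{n-i}{n-d+1} P_i(k) \Big/ \binom{n}{d-1} = \binom{n-k}{d-1} \Big/ \binom{n}{d-1} \ge 0,$$

by Problem 41 of Ch. 5. Note $\beta_k = 0$ for $n - d + 2 \le k \le n$. Therefore:

if $\mathscr{C}$ is any $(n, M, d)$ binary code,

$$M \le \beta(0) = 2^{n-d+1}. \tag{45}$$

This is a generalization of the Singleton bound (Ch. 1, Theorem 11) to
nonlinear codes.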
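The closed form for the coefficients $\beta_k$ can be confirmed for small parameters by computing the Krawtchouk expansion directly. This is a sketch under our own naming; it evaluates $\beta_k = 2^{-n}\sum_i \beta(i)P_i(k)$ with exact fractions.

```python
from fractions import Fraction
from math import comb


def krawtchouk(n, k, x):
    """Binary Krawtchouk polynomial P_k(x) for length n at an integer 0 <= x <= n."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))


def singleton_beta(n, d):
    """Krawtchouk coefficients of beta(x) = 2^(n-d+1) * prod_{j=d}^{n} (1 - x/j)."""

    def beta_at(i):
        val = Fraction(2 ** (n - d + 1))
        for j in range(d, n + 1):
            val *= Fraction(j - i, j)
        return val

    return [
        sum(beta_at(i) * krawtchouk(n, i, k) for i in range(n + 1)) / 2**n
        for k in range(n + 1)
    ]


n, d = 7, 3
coeffs = singleton_beta(n, d)
closed_form = [Fraction(comb(n - k, d - 1), comb(n, d - 1)) for k in range(n + 1)]
print(coeffs == closed_form)
```

Here `krawtchouk(n, i, k)` is $P_i(k)$, so the argument order matters; swapping the two indices would compute a different transform.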


Problems. (17) Show that if $\mathscr{C}$ is an $(n, M, d)$ code over $GF(q)$, $M \le q^{n-d+1}$.

(18) (Delsarte.) Let $\mathscr{C}$ be an $(n, M, d)$ binary code with the property that if
$u \in \mathscr{C}$ then the complementary vector $\bar{u}$ is also in $\mathscr{C}$. (Example: any linear
code containing the all-ones vector 1.) Prove that

$$M \le \frac{8d(n - d)}{n - (n - 2d)^2}, \tag{46}$$

provided the denominator is positive. Equation (46) is known as the Grey-
Rankin bound, after Grey [561] and Rankin [1090, 1091]. [Hint: Form an
$(n, \tfrac{1}{2}M, d)$ code $\mathscr{C}'$ by taking one codeword of $\mathscr{C}$ from each complementary
pair. The distances in $\mathscr{C}'$ are in the range $[d, n - d]$. Use Corollary 18 of Ch. 5
and $\beta(x) = a(d - x)(n - d - x)$, where $a$ is a suitable constant.]

Note that the $(n - 1, 2n, \tfrac{1}{2}(n - 2))$ conference matrix codes of Ch. 2, §4
satisfy (46) with equality, yet are not closed under complementation. See
Research Problem 2.2.

(19) Show that every feasible solution to Problem (III) satisfies $B_{\tau_i} \le \binom{n}{\tau_i}$.

(20) (McEliece [944].) Another version of Corollary 21: if

$$\delta(x) = \frac{1}{2^n} \sum_{k=0}^{n} \delta_k P_k(x)$$

satisfies (i) $\delta(0) = 1$, $\delta(j) \ge 0$ for $j = 1, \ldots, n$, and (ii) $\delta_k \le 0$ for $k = d, d+1, \ldots, n$,
then

$$A(n, d) \le \delta_0 = \sum_{j=0}^{n} \binom{n}{j} \delta(j). \tag{47}$$

[Hint: Set $\beta_k = \delta(k)$ and use Corollary 18 of Ch. 5.]

(21) ([944].) If $n$ is even, choose $\delta(x)$ to be a quadratic with $\delta(0) = 1$,
$\delta(n/2 + 1) = 0$, and hence rederive Problem 11.

(22) ([944].) If $n \equiv 1 \pmod 4$, choose $\delta(x) = \{(x - a)^2 - \tfrac{1}{2} - \tfrac{1}{2}P_n(x)\}/(a^2 - 1)$,
where $a = (n + 1)/2$, and show $A(n, 3) \le 2^n/(n + 3)$. In particular, show
$A(2^m - 3, 3) = 2^{2^m - m - 3}$.


Research Problem (17.5). Give a combinatorial proof that $A(2^m - 3, 3) = 2^{2^m - m - 3}$.


The linear programming bound for constant weight codes. (Delsarte [352].)
There is also a linear programming bound for constant weight codes. Let $\mathscr{D}$ be
a binary code of length $n$, distance $2\delta$, and constant weight $w$. Then $|\mathscr{D}| \le
A(n, 2\delta, w)$. Let $\{B_{2i}\}$, $i = 0, 1, \ldots, w$, be the distance distribution of $\mathscr{D}$. It
follows from the theory of association schemes (see Theorem 12 of Ch. 21)
that the transformed quantities $B'_{2k}$ are nonnegative, where now

$$B'_{2k} = \frac{1}{|\mathscr{D}|} \sum_{i=0}^{w} B_{2i} Q_k(i), \quad k = 0, \ldots, w, \tag{48}$$

and the coefficients $Q_k(i)$ are given by

$$Q_k(i) = \frac{n - 2k + 1}{n - k + 1} \binom{n}{k} E_i(k) \Big/ \binom{w}{i}\binom{n - w}{i}, \tag{49}$$

and

$$E_i(x) = \sum_{j=0}^{i} (-1)^j \binom{x}{j} \binom{w - x}{i - j} \binom{n - w - x}{i - j} \tag{50}$$

is an Eberlein polynomial (see §6 of Ch. 21).

So we can get bounds on $A(n, 2\delta, w)$ by linear programming: maximize
$B_{2\delta} + B_{2\delta+2} + \cdots + B_{2w}$ subject to $B_{2i} \ge 0$ and $B'_{2k} \ge 0$ for all $i$ and $k$. As before it
is often possible to impose additional constraints.

Example. $A(24, 10, 8) \le 69$. When $n = 24$, $2\delta = 10$ and $w = 8$ the variables $B_{10}$,
$B_{12}$, $B_{14}$, $B_{16}$ must satisfy an additional constraint, namely $B_{16} \le A(16, 10, 8) = 4$
(from Appendix A). (For $B_{16}$ is bounded by the number of vectors of weight
8 which are disjoint from a fixed vector of weight 8 and have mutual distance
at least 10 apart.) By linear programming the maximum of $B_{10} + B_{12} + B_{14} + B_{16}$
subject to these constraints is 68, and is attained when

$$B_{10} = 56, \quad B_{12} = 0, \quad B_{14} = 8, \quad B_{16} = 4.$$

Thus $A(24, 10, 8) \le 69$.


Research Problem (17.6). Is there a code with this distance distribution? The 
answer is no. (R. E. Kibler, written communication.) 


The entries in Fig. 3 of Appendix A which are marked with an L were
obtained in this way. It is also possible to give a linear programming bound
for $T(w_1, n_1, w_2, n_2, d)$: see Best et al. [140].
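The coefficients $Q_k(i)$ of the constant weight bound are straightforward to tabulate. The sketch below is ours and assumes the Eberlein polynomial in the form $E_i(x) = \sum_j (-1)^j \binom{x}{j}\binom{w-x}{i-j}\binom{n-w-x}{i-j}$; the helper names, and the convention that a dictionary `B` maps $i$ to $B_{2i}$, are our own.

```python
from fractions import Fraction
from math import comb


def binom(a, b):
    """Binomial coefficient that is 0 when either argument is negative."""
    return comb(a, b) if a >= 0 and b >= 0 else 0


def eberlein(n, w, i, x):
    """Eberlein polynomial E_i(x) for length n and weight w."""
    return sum(
        (-1) ** j * binom(x, j) * binom(w - x, i - j) * binom(n - w - x, i - j)
        for j in range(i + 1)
    )


def Q(n, w, k, i):
    """Coefficient Q_k(i) of the constant weight linear programming bound."""
    mu = Fraction(n - 2 * k + 1, n - k + 1) * comb(n, k)
    return mu * eberlein(n, w, i, k) / (comb(w, i) * comb(n - w, i))


def transformed(n, w, B):
    """Return the list |D| * B'_{2k}, k = 0..w, for a distribution B: i -> B_{2i}."""
    return [sum(b * Q(n, w, k, i) for i, b in B.items()) for k in range(w + 1)]


# Distance distribution from the example: distances 10, 14, 16 are i = 5, 7, 8.
B = {0: 1, 5: 56, 6: 0, 7: 8, 8: 4}
for k, value in enumerate(transformed(24, 8, B)):
    print(k, value)
```

Two quick sanity checks on the table are that $Q_0(i) = 1$ for every $i$ and that $\sum_k Q_k(0) = \binom{n}{w}$. For the distribution above the entry for $k = 1$ comes out as exactly zero, so that constraint is tight.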


§5. The Griesmer bound


For an $[n, k, d]$ linear code the Singleton bound (45) says $n \ge d + k - 1$. The
Griesmer bound, Equation (52), increases the RHS of this inequality. The
bound is best stated in terms of

$N(k, d)$ = length of the shortest binary
linear code of dimension $k$ and
minimum distance $d$.


Theorem 23.

$$N(k, d) \ge d + N\left(k - 1, \left\lceil \frac{d}{2} \right\rceil\right). \tag{51}$$


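Proof. Let $\mathscr{C}$ be an $[N(k, d), k, d]$ code, with generator matrix $G$. Without loss
of generality,

$$G = \begin{bmatrix} 0\ 0 \cdots 0 & 1\ 1 \cdots 1 \\ G_1 & G_2 \end{bmatrix},$$

where the left-hand block has $N(k, d) - d$ columns and the right-hand block has $d$ columns.

$G_1$ has rank $k - 1$, or else we could make the first row of $G_1$ zero and $\mathscr{C}$ would
have minimum distance less than $d$. Let $G_1$ generate an $[N(k, d) - d, k - 1, d_1]$
code. Suppose $|u \mid v| \in \mathscr{C}$ where $\operatorname{wt}(u) = d_1$. Since $|u \mid \bar{v}| \in \mathscr{C}$, we have

$$d_1 + \operatorname{wt}(v) \ge d,$$
$$d_1 + d - \operatorname{wt}(v) \ge d,$$

and, adding, $2d_1 \ge d$ or $d_1 \ge \lceil d/2 \rceil$. Therefore
$N(k - 1, \lceil d/2 \rceil) \le N(k, d) - d$. Q.E.D.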


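Theorem 24. (The Griesmer bound [562]; see also Solomon and Stiffler [1257].)

$$N(k, d) \ge \sum_{i=0}^{k-1} \left\lceil \frac{d}{2^i} \right\rceil. \tag{52}$$

Proof. Iterating Theorem 23 we find:

$$N(k, d) \ge d + N\left(k - 1, \left\lceil \frac{d}{2} \right\rceil\right) \ge d + \left\lceil \frac{d}{2} \right\rceil + N\left(k - 2, \left\lceil \frac{d}{4} \right\rceil\right) \ge \cdots \ge \sum_{i=0}^{k-1} \left\lceil \frac{d}{2^i} \right\rceil. \quad \text{Q.E.D.}$$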


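Examples. (1) Let us find $N(5, 7)$, i.e. the shortest triple-error-correcting code
of dimension 5. From Theorem 24,

$$N(5, 7) \ge 7 + \left\lceil \tfrac{7}{2} \right\rceil + \left\lceil \tfrac{7}{4} \right\rceil + \left\lceil \tfrac{7}{8} \right\rceil + \left\lceil \tfrac{7}{16} \right\rceil = 7 + 4 + 2 + 1 + 1 = 15.$$

In fact there is a [15, 5, 7] BCH code, so $N(5, 7) = 15$.
(2) If $d = 2^{k-1}$, then from Theorem 24

$$N(k, 2^{k-1}) \ge 2^{k-1} + 2^{k-2} + \cdots + 2 + 1 = 2^k - 1.$$

In fact the $[2^k - 1, k, 2^{k-1}]$ simplex code (see Ch. 1) shows that this bound is
realized.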
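The right-hand side of (52) is a one-line computation. A minimal sketch (the function name is ours), using the integer identity $\lceil a/b \rceil = -\lfloor -a/b \rfloor$ to avoid floating point:

```python
def griesmer_bound(k, d):
    """Right-hand side of (52): sum of ceil(d / 2^i) for i = 0, ..., k-1."""
    return sum(-(-d // 2**i) for i in range(k))


print(griesmer_bound(5, 7))
print([griesmer_bound(k, 2 ** (k - 1)) for k in range(1, 6)])
```

Since every term after the first is at least 1, the value is never smaller than the Singleton value $d + k - 1$.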


§6. Constructing linear codes; anticodes


Let $S_k$ be the generator matrix of a $[2^k - 1, k, 2^{k-1}]$ simplex code; thus the
columns of $S_k$ consist of all distinct nonzero binary $k$-tuples arranged in some
order. As the preceding example showed, this code meets the Griesmer
bound, as does the code whose generator matrix consists of several copies of
$S_k$ placed side-by-side.

An excellent way of constructing good linear codes is to take $S_k$, or several
copies of $S_k$, and delete certain columns. (See Figs. 17.1a, 17.1b.) The columns
to be deleted themselves form the generator matrix of what is called an
anticode (after Farrell et al. [418], [420], [421]; see also [889], [1098]). This is a
code which has an upper bound on the distance between its codewords. Even a
distance of zero between codewords is allowed, and thus an anticode may
contain repeated codewords.


Definition. If $G$ is any $k \times m$ binary matrix, then all $2^k$ linear combinations of
its rows form the codewords of an anticode of length $m$. The maximum
distance $\delta$ of the anticode is the maximum weight of any of its codewords. If
rank $G = r$, each codeword occurs $2^{k-r}$ times.


Example. (1)

$$G = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}$$

generates the anticode

000
011
101
110
110
101
011
000

of length $m = 3$, with $2^3$ codewords and maximum distance $\delta = 2$. Each
codeword occurs twice. Similarly one finds an anticode of length 3, with $2^k$
codewords and maximum distance 2, for any $k \ge 2$.


[Fig. 17.1a. Using an anticode to construct new codes. The diagram shows three matrices with $k$ rows: the generator matrix $S_k$ of the simplex code ($2^k - 1$ columns; codewords have constant weight $2^{k-1}$), minus the generator matrix of the anticode ($m$ columns; codewords have maximum weight $\delta$), equals the generator matrix of the new code, of length $2^k - 1 - m$, dimension $\le k$, and minimum weight $2^{k-1} - \delta$.]

[Fig. 17.1b. If the anticode has repeated columns. The diagram shows $s$ copies of $S_k$ placed side by side ($s(2^k - 1)$ columns, constant weight $s2^{k-1}$), minus the generator matrix of the anticode ($m$ columns, maximum weight $\delta$), equals the new code, with $s(2^k - 1) - m$ columns and minimum weight $s2^{k-1} - \delta$.]


Using an anticode to construct new codes. First suppose the $k \times m$ generator
matrix of the anticode has distinct columns. Then (see Fig. 17.1a) the columns
of this matrix can be deleted from $S_k$, leaving a generator matrix for the new
code. Clearly the new code has length $2^k - 1 - m$, dimension $\le k$, and
minimum distance $2^{k-1} - \delta$.


Example. (2) Deleting the columns of the anticode in Example 1 from $S_k$ we
obtain a $[2^k - 4, k, 2^{k-1} - 2]$ code, provided $k \ge 3$.

Let the new code $\mathscr{C}$ have dimension $r$, and suppose $r \le k - 1$. A generator
matrix for $\mathscr{C}$ then contains $2^k - 1 - m$ distinct nonzero columns, no $k$ of which
are linearly independent. Thus $2^k - 1 - m \le 2^{k-1} - 1$, so $m \ge 2^{k-1}$. Hence if
$m < 2^{k-1}$ the dimension of $\mathscr{C}$ is $k$.

More generally, suppose $s$ is the maximum number of times any column
occurs in the generator matrix of the anticode. Then the new code is formed
by deleting the columns of this matrix from $s$ copies of $S_k$ placed side-by-side
(Fig. 17.1b), and has length $s(2^k - 1) - m$, dimension $\le k$, and minimum
distance $s2^{k-1} - \delta$.

Further examples will be specified using the language of projective
geometry (see Appendix B). The points of the projective geometry $PG(k - 1, 2)$
consist of all $2^k - 1$ nonzero binary $k$-tuples, and so are exactly the columns of
$S_k$. A subset of these columns is thus just a subset of the points of the
geometry.


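Examples. (3) Delete a line (= 3 points) from $S_3$:

$$\begin{bmatrix} 0&0&0&1&1&1&1 \\ 0&1&1&0&0&1&1 \\ 1&0&1&0&1&0&1 \end{bmatrix} - \begin{bmatrix} 0&1&1 \\ 1&0&1 \\ 1&1&0 \end{bmatrix} = \begin{bmatrix} 0&0&1&1 \\ 0&1&0&1 \\ 1&0&0&1 \end{bmatrix}.$$

Here the first matrix is $S_3$, the second is the anticode (a line), and the third
generates the new code, with $n = 4$, $k = 3$, $d = 2$.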

(4) For $k = 5$, form the anticode whose columns consist of the 15 points of
a projective subspace of dimension 3 and a single point not on this subspace,
as shown in Fig. 17.2.

[Fig. 17.2. Generator matrix for an anticode with $m = 16$, $k = 5$, $\delta = 9$. The figure shows a $5 \times 16$ binary matrix whose columns are the 15 points of the subspace together with one further point.]

Deleting these columns from $S_5$ gives a [15, 5, 7] code meeting the Griesmer
bound.

(5) In $S_6$ choose a projective subspace $B_5$ of dimension 4 (a 4-flat)
containing 31 points, and a 3-flat $B_4$ (15 points). These meet in a plane $C_3$ (7
points). Choose a plane $B_3$ which does not meet $C_3$. ($B_3$ intersects $B_5$ in a line
$C_2$ and $B_4$ in a point $C_1$.) Choose a line $B_2$ which does not meet $B_4$ or $B_3$ (and
meets $B_5$ in a point $C_1'$). Finally choose a point $B_1$ which does not lie on any
previous $B_i$. The generator matrix of the anticode has as columns the
$31 + 15 + 7 + 3 + 1 = 57$ points of $B_5$, $B_4$, $B_3$, $B_2$ and $B_1$. The columns of $C_3$, $C_2$,
$C_1$ and $C_1'$ appear twice in this matrix. The maximum weight is $16 + 8 + 4 + 2 +
1 = 31$, and $k = 6$. We delete this anticode from two copies of $S_6$, as follows:

From the first copy of $S_6$: delete $B_5$, $C_1$, $B_1$. From the second copy of $S_6$: delete $B_4$, $B_3 - C_1$, $B_2$.

Note that the points of $C_3$, $C_2$, $C_1$ and $C_1'$ get deleted twice, once from each
copy of $S_6$. The resulting code has length $n = 2 \cdot 63 - 57 = 69$, $k = 6$, $d =
2 \cdot 32 - 31 = 33$, and meets the Griesmer bound.


Problems. (23) Use a line and a 3-flat in $PG(5, 2)$ to get an anticode with
$m = 18$, $k = 6$, $\delta = 10$, and hence find a [45, 6, 22] code.
(24) Construct a [53, 6, 26] code.


A General Construction (Solomon and Stiffler [1257], Belov et al. [102]; see 
also Alltop [25].) The following technique for forming anticodes from sub- 
spaces generalizes the preceding examples and gives a large class of codes 
which meet the Griesmer bound. 

Given $d$ and $k$, define

$$s = \left\lceil \frac{d}{2^{k-1}} \right\rceil \quad \text{and} \quad d = s2^{k-1} - \sum_{i=1}^{p} 2^{u_i - 1}, \tag{53}$$

where $k > u_1 > \cdots > u_p \ge 1$.

Suppose we can find $p$ projective subspaces $B_1, \ldots, B_p$ of $S_k$, where $B_i$ has
$2^{u_i} - 1$ points, with the property that

$$\text{every } s + 1 \ B_i\text{'s have an empty intersection.} \tag{54}$$

This means that any point of $S_k$ belongs to at most $s$ $B_i$'s. The anticode
consists of the columns of $B_1, \ldots$ and $B_p$; no column occurs more than $s$
times. The new code is obtained by deleting the anticode from $s$ copies of $S_k$.
It has length

$$n = s(2^k - 1) - \sum_{i=1}^{p} (2^{u_i} - 1),$$

dimension $k$, minimum distance given by (53), and meets the Griesmer bound.

These codes are denoted by BV in Fig. 2 of Appendix A.


Theorem 25. (Belov et al. [102].) The construction is possible $\Leftrightarrow$

$$\sum_{i=1}^{\min(s+1,\,p)} u_i \le sk.$$


Proof. ($\Leftarrow$) Let $f_i(x)$ be a binary irreducible polynomial of degree $k - u_i$, for
$i = 1, \ldots, p$, and let $A_i$ consist of all multiples of $f_i(x)$ having degree less than
$k$. Then $A_i$ is a vector space of dimension $u_i$ spanned by $f_i(x)$,
$xf_i(x), \ldots, x^{u_i - 1}f_i(x)$. $B_i$ can be identified with the nonzero elements of this
space in the obvious way. Suppose

$$\sum_{i=1}^{\min(s+1,\,p)} u_i \le sk.$$

We must show that if $I$ is any $(s+1)$-subset of $\{1, 2, \ldots, p\}$ (if $s + 1 \le p$) or if
$I = \{1, \ldots, p\}$ (if $s + 1 > p$) then $\bigcap_{i \in I} B_i = \emptyset$. In fact,

$$\begin{aligned}
\bigcap_{i \in I} B_i = \emptyset &\Leftrightarrow \deg \operatorname*{lcm}_{i \in I} f_i(x) \ge k \\
&\Leftrightarrow \sum_{i \in I} (k - u_i) \ge k \\
&\Leftrightarrow \sum_{i \in I} u_i \le sk \\
&\Leftarrow \sum_{i=1}^{\min(s+1,\,p)} u_i \le sk
\end{aligned}$$

(since $u_1 > u_2 > \cdots$), as required.
($\Rightarrow$) Suppose the construction is possible, i.e. (54) holds. If $s + 1 > p$ there
is nothing to prove. Assume $s + 1 \le p$. From (54), for any $(s+1)$-set $I$ we
have $\bigcap_{i \in I} B_i = \emptyset$. But by Problem 27,

$$\Big| \bigcap_{i \in I} B_i \Big| \ge \exp_2\Big( \sum_{i \in I} u_i - sk \Big) - 1.$$

Hence $\sum_{i \in I} u_i \le sk$. Q.E.D.


Corollary 26. Given $d$ and $k$, there is an $[n, k, d]$ code meeting the Griesmer
bound if

$$\sum_{i=1}^{\min(s+1,\,p)} u_i \le sk,$$

where $s, u_1, \ldots, u_p$ are defined by (53).


Note that the [5, 4, 2] even weight code meets the Griesmer bound. In this
case $s = 1$, $d = 2 = 8 - 4 - 2$, $u_1 = 3$, $u_2 = 2$, $p = 2$, and $\sum_{i=1}^{2} u_i > sk$. So the
conditions given in the Corollary are not necessary for a code to meet the
Griesmer bound. Baumert and McEliece [87] have shown that, given $k$, the
Griesmer bound is attained for all sufficiently large $d$.


Research Problem (17.7). Find necessary and sufficient conditions on \(k\) and \(d\)
for the Griesmer bound (52) to be tight.

The above construction can be used to find the largest linear codes in the
region \(n \le 2d\).


Theorem 27. [Venturini [1369]; rediscovered by Patel [1027]. See Patel [1028]
for the generalization to GF(q).] Given \(n\) and \(d\), let

\[ d = x_j 2^j - \sum_{i=0}^{j-1} a_{ij} 2^i \quad \text{for } j = 0, 1, \ldots \tag{55} \]

where \(x_j\) is a positive integer and \(a_{ij} = 0\) or 1, and let

\[ t_j = x_j - \sum_{i=0}^{j-1} a_{ij}. \tag{56} \]

There is a unique value of \(j\), say \(j = k\), for which \(t_{k-1} \ge 2d - n\) and \(t_k < 2d - n\).
Then the largest linear code of length \(n\) and minimum distance \(d\) contains
\(B(n, d) = 2^k\) codewords.


The proof is omitted. To construct such an \([n, k, d]\) code we proceed as
follows. Set \(t = 2d - n\). Then

\[ t_{k-1} = x_{k-1} - \sum_{i=0}^{k-2} a_{i,k-1} \ge t, \tag{57} \]

\[ n = 2d - t = (t_{k-1} - t) + x_{k-1}(2^k - 1) - \sum_{i=0}^{k-2} a_{i,k-1}\,(2^{i+1} - 1), \tag{58} \]

from (55). Form the anticode consisting of projective subspaces \(B_{i+1}\) (with
\(2^{i+1} - 1\) points) for every \(i\) for which \(a_{i,k-1} = 1\). Delete this anticode from \(x_{k-1}\)
copies of \(S_k\), which is possible by (57). Finally, add \(t_{k-1} - t\) arbitrary columns.
The resulting code has length \(n\) (from (58)), dimension \(k\), and minimum
distance (at least)

\[ x_{k-1} 2^{k-1} - \sum_{i=0}^{k-2} a_{i,k-1} 2^i = d, \quad \text{from (55).} \]
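The recipe in Theorem 27 is easy to mechanize. The following is a minimal sketch, assuming the reading of (55) and (56) in which \(x_j = \lceil d/2^j \rceil\) and the \(a_{ij}\) are the binary digits of \(x_j 2^j - d\); the function names are ours and not part of the text.

```python
def t_value(d: int, j: int) -> int:
    """t_j of (56), with d written as x_j * 2**j - sum_i a_ij * 2**i as in (55)."""
    x_j = -(-d // 2**j)               # ceiling of d / 2^j
    remainder = x_j * 2**j - d        # 0 <= remainder < 2^j; its bits are the a_ij
    return x_j - bin(remainder).count("1")


def largest_linear_code_dimension(n: int, d: int) -> int:
    """Return k with t_{k-1} >= 2d - n > t_k, so that B(n, d) = 2**k."""
    if not (1 <= d <= n <= 2 * d):
        raise ValueError("this sketch assumes d <= n <= 2d")
    target = 2 * d - n
    k = 1                             # t_0 = d >= 2d - n always holds here
    while t_value(d, k) >= target:
        k += 1
    return k


if __name__ == "__main__":
    k = largest_linear_code_dimension(100, 50)
    print(k, 2**k)                    # 5 32
```

For \(n = 100\), \(d = 50\) the values \(t_0, \ldots, t_5\) are 50, 25, 12, 5, 1, -1, so \(k = 5\) and \(B(100, 50) = 32\).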


Problem. (25) Show \(A(100, 50) = 200\), \(B(100, 50) = 32\). Thus \(B(n, d)\) can be
much smaller than \(A(n, d)\).


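Theorem 28. If \(d \le 2^{k-1}\), and \(\mathscr{C}\) is an \([n, k, d]\) code which attains the Griesmer
bound, then \(\mathscr{C}\) has no repeated columns.


Proof. Suppose \(\mathscr{C}\) has a repeated column. Then the generator matrix of \(\mathscr{C}\) may be
taken to be

\[ G = \begin{bmatrix} 1 & 1 & * \\ 0 & 0 & G_1 \end{bmatrix}. \]

Let \(\mathscr{C}'\) be the \([n - 2, k - 1, d']\) code generated by \(G_1\). Then from (52)

\[ n - 2 \ge \sum_{i=0}^{k-2} \left\lceil \frac{d'}{2^i} \right\rceil \ge \sum_{i=0}^{k-2} \left\lceil \frac{d}{2^i} \right\rceil. \]

Since \(d \le 2^{k-1}\), the remaining term \(\lceil d/2^{k-1} \rceil\) of the Griesmer sum equals 1,
and so \(\mathscr{C}\) does not meet the Griesmer bound. Q.E.D.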


Definition. An optimal anticode is one with the largest length \(m\) for given
values of \(k\) and maximum distance \(\delta\).


Suppose an anticode is deleted from the appropriate number of copies of a
simplex code. If the resulting code meets the Griesmer bound, the anticode is
certainly optimal. Thus the above construction gives many optimal anticodes
formed from subspaces. E.g. the anticodes of Examples 3, 4, 5 are optimal.
Farrell and Farrag [420, 421] give tables of optimal anticodes.


Theorem 29. Let \(\mathscr{A}\) be an optimal anticode of length \(m\), with \(2^k\) codewords and
maximum distance \(\delta\), such that no column appears in \(\mathscr{A}\) more than \(s\) times.
Suppose further that the code obtained by deleting \(\mathscr{A}\) from \(s\) copies of \(S_k\) meets
the Griesmer bound. Then the new anticode \(\mathscr{A}'\) obtained by taking every
codeword of \(\mathscr{A}\) twice is also optimal.


Proof. By hypothesis

\[ s(2^k - 1) - m = \sum_{i=0}^{k-1} \left\lceil \frac{s 2^{k-1} - \delta}{2^i} \right\rceil. \tag{59} \]

From (59) and 





it follows that

\[ s(2^{k+1} - 1) - m = \sum_{i=0}^{k} \left\lceil \frac{s 2^{k} - \delta}{2^i} \right\rceil; \]

hence \(sS_{k+1} - \mathscr{A}'\) meets the Griesmer bound. Q.E.D.


This theorem often enables us to get an optimal anticode with parameters
\(m, k + 1, \delta\) from one with parameters \(m, k, \delta\).

Not all optimal anticodes have a simple geometrical description. For
example, the \([t + 1, t, 2]\) even weight code \(\mathscr{C}\) meets the Griesmer bound. Thus
\(S_t - \mathscr{C}\) is an optimal anticode with length \(2^t - t - 2\), \(2^t\) codewords and
maximum distance \(2^{t-1} - 2\). From Theorem 29 we get optimal anticodes with \(2^k\)
codewords for all \(k \ge t\), and hence new codes meeting the Griesmer bound, as
follows.


Example. (6)

    Anticode                    Simplex code     New code
    t = 3, m = 3,  δ = 2:       [ 15, 4,  8]  →  [ 12, 4,  6]   (k = 4)
                                [ 31, 5, 16]  →  [ 28, 5, 14]   (k = 5)
    t = 4, m = 10, δ = 6:       [ 31, 5, 16]  →  [ 21, 5, 10]   (k = 5)
                                [ 63, 6, 32]  →  [ 53, 6, 26]   (k = 6)
    t = 5, m = 25, δ = 14:      [ 63, 6, 32]  →  [ 38, 6, 18]   (k = 6)
                                [127, 7, 64]  →  [102, 7, 50]   (k = 7).
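That the six new codes of Example 6 really do meet the Griesmer bound (52), \(n \ge \sum_{i=0}^{k-1} \lceil d/2^i \rceil\), can be checked directly. The sketch below is only an arithmetic check; the function name and the list of parameters (copied from the example) are ours.

```python
def griesmer_length(k: int, d: int) -> int:
    """Right-hand side of the Griesmer bound: sum of ceil(d / 2^i), i = 0..k-1."""
    return sum(-(-d // 2**i) for i in range(k))


# (n, k, d) for the six codes obtained in Example 6.
EXAMPLE_6_CODES = [
    (12, 4, 6), (28, 5, 14),
    (21, 5, 10), (53, 6, 26),
    (38, 6, 18), (102, 7, 50),
]

if __name__ == "__main__":
    for n, k, d in EXAMPLE_6_CODES:
        status = "meets" if n == griesmer_length(k, d) else "does not meet"
        print(f"[{n}, {k}, {d}] {status} the Griesmer bound")
```

Each length equals the Griesmer sum, e.g. 26 + 13 + 7 + 4 + 2 + 1 = 53 for the [53, 6, 26] code.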


The first two anticodes consist of the 3 points of a line. However the
others are not of the geometrical type constructed above. E.g. the second
example consists of 10 points and 10 lines in PG(3, 2) with 3 points on each
line and 3 lines through each point, as shown in Fig. 17.3. Figure 17.3 is called
the Desargues configuration, since it arises in proving Desargues' theorem -
see Hilbert and Cohn-Vossen [654, Fig. 135, p. 123].


Research Problem (17.8). Is there a natural description of the geometrical 
configurations corresponding to the other optimal anticodes? 


Problem. (26) ([420].) Since an anticode can contain repeated codewords,
anticodes may be "stacked" to get new ones.

(a) Construct anticodes \(\mathscr{A}\) and \(\mathscr{B}\) with parameters \(m = 10\), \(k = 5\), \(\delta = 6\), and
\(m = 16\), \(k = 5\), \(\delta = 9\), respectively.

(b) Show that the stacks


EF and 


0 x s UB B 
lx 139 B 
(where 0 and 1 denote columns of 0's and 1's) form anticodes with \(m = 21\),
\(k = 6\), \(\delta = 12\); and \(m = 33\), \(k = 6\), \(\delta = 18\).

(c) Hence obtain [42, 6, 20] and [30, 6, 14] codes.


As a final example, Fig. 17.4 lists some of the optimal linear codes with 
k = 6 (thus giving values of N(6, d)). 


[Figure: ten points of PG(3, 2) joined by ten lines.]

Fig. 17.3. The Desargues configuration.


Code            Anticode
[63, 6, 32]     (none)
[60, 6, 30]     a line
[56, 6, 28]     a plane
[53, 6, 26]     plane + line, or Fig. 17.3
[48, 6, 24]     3-flat
[45, 6, 22]     3-flat + line
[42, 6, 20]     see Problem 26
[38, 6, 18]     see Example 6
[32, 6, 16]     4-flat
[30, 6, 14]     see Problem 26
[26, 6, 12]     see Baumert and McEliece [87]
[23, 6, 10]     see Baumert and McEliece [87]
[18, 6,  8]     see Baumert and McEliece [87]
[15, 6,  6]     (a BCH code)
[11, 6,  4]     (a shortened Hamming code)
[ 7, 6,  2]     (even weight code)


Fig. 17.4. Optimal codes with k = 6.


Helgert and Stinaff [636] give a much more extensive table. 


Problems. (27) ([102].) For \(i = 1, \ldots, t\) let \(\mathscr{C}_i\) be an \([n, k_i]\) binary linear code.
Show that

\[ \dim \bigcap_{i=1}^{t} \mathscr{C}_i \ge \max\Bigl\{ 0, \sum_{i=1}^{t} k_i - (t - 1)n \Bigr\}. \]

[Hint: induction on \(t\).]

(28) Show that if \(\mathscr{A}\) is an optimal anticode with \(\delta\) even and \(m < 2^k - 1\), then
an extra column can be added to the generator matrix to give an \(m + 1, k, \delta + 1\)
optimal anticode.

(29) ([889].) Let \(\mathscr{C}\) have generator matrix \([I \mid A]\) where \(A\) is a \(k \times (n - k)\)
matrix whose rows have weight \(\le 1\) and columns have even weight \(> 0\). This implies
\(3k \ge 2n\). Show that \(\mathscr{C}\) is an \([n, k]\) code with maximum weight \(k\).


§7. Asymptotic bounds


This section treats asymptotic bounds, applicable when \(n\) is large. In this
case it turns out that the simplest results are obtained if the rate of the largest
code,

\[ R = \frac{1}{n} \log_2 A(n, d), \]




is plotted as a function of \(d/n\). Figure 17.7 shows the best bounds presently
known. All codes lie on or below the McEliece-Rodemich-Rumsey-Welch
upper bound (Theorems 35, 37), while the best codes lie on or above the
Gilbert-Varshamov lower bound. The latter was given in Ch. 1, Theorem 12,
and asymptotically takes the following form.


Theorem 30. (The Gilbert-Varshamov lower bound.) Suppose \(0 \le \delta < \tfrac12\). Then
there exists an infinite sequence of \([n, k, d]\) binary linear codes with \(d/n \ge \delta\)
and rate \(R = k/n\) satisfying

\[ R \ge 1 - H_2\!\left(\frac{d}{n}\right), \quad \text{for all } n. \]


Notation. \(f(n) \lesssim g(n)\) as \(n \to \infty\) means \(f(n) \le g(n)(1 + \epsilon(n))\), where \(|\epsilon(n)| \to 0\) as
\(n \to \infty\). Also \(H_2(x) = -x \log_2 x - (1 - x) \log_2 (1 - x)\). (See §11 of Ch. 10.)


Proof. From Theorem 12 of Ch. 1 and Corollary 9 of Ch. 10. Q.E.D.


There are alternant codes (Theorem 3 of Ch. 12), Goppa codes (Problem 9
of Ch. 12), double circulant or quasi-cyclic codes (§7 of Ch. 16) and self-dual
codes (§6 of Ch. 19) which meet the Gilbert-Varshamov bound. (See also
Research Problem 9.2.) Often the following very simple argument is enough
to show that a family of codes contains codes which meet the
Gilbert-Varshamov bound.


Theorem 31. Suppose there is an infinite family of binary codes \(\Phi_1, \Phi_2, \ldots\),
where \(\Phi_i\) is a set of \([n_i, k_i]\) codes such that (i) \(k_i/n_i > R\) and (ii) each nonzero
vector of length \(n_i\) belongs to the same number of codes in \(\Phi_i\). Then there are
codes in this family which asymptotically meet the Gilbert-Varshamov bound,
or more precisely, are such that

\[ R \ge 1 - H_2\!\left(\frac{d}{n}\right) \quad \text{for all } n. \]

Proof. Let \(N_0\) be the total number of codes in \(\Phi_i\), and \(N_1\) the number which
contain a particular nonzero vector \(v\). By hypothesis \(N_1\) is independent of the
choice of \(v\), hence

\[ (2^{n_i} - 1) N_1 = (2^{k_i} - 1) N_0. \]

The number of nonzero vectors \(v\) of weight less than \(d\) is

\[ \sum_{j=1}^{d-1} \binom{n_i}{j}, \]

hence the number of codes in \(\Phi_i\) with minimum distance less than \(d\) is at most

\[ N_1 \sum_{j=1}^{d-1} \binom{n_i}{j}. \]

Provided this is less than \(N_0\), i.e. provided

\[ \sum_{j=1}^{d-1} \binom{n_i}{j} < \frac{2^{n_i} - 1}{2^{k_i} - 1}, \]

\(\Phi_i\) contains a code of minimum distance \(\ge d\). The result now follows from
Corollary 9 of Ch. 10. Q.E.D.


Problems. (30) Show that there are linear codes of any given rate \(R\) which
meet the Gilbert-Varshamov bound. [Hint: take \(\Phi_i\) to be the set of linear
codes of length \(i\) and dimension \(k_i = \lceil Ri \rceil\).]

(31) (Koshelev [779], Kozlov [781], Pierce [1044]; see also Gallager [464,
465].) Let \(G\) be a \(k \times n\) binary matrix whose entries are chosen at random,
and let \(\mathscr{C}(G)\) be the code with generator matrix \(G\). Show that if \(k/n\) is fixed
and \(n \to \infty\), \(\mathscr{C}(G)\) meets the Gilbert-Varshamov bound with probability
approaching 1.

If \(d/n \ge 1/2\) then it follows from the Plotkin bound that \((1/n) \log_2 A(n, d) \to 0\)
as \(n \to \infty\), so in what follows we assume \(d/n < 1/2\).


We shall give a series of upper bounds, of increasing strength (and 
difficulty). The first is the sphere-packing bound, Theorem 12, which asymp- 
totically becomes: 


Theorem 32. (The sphere-packing or Hamming upper bound.) For any
\((n, M, d)\) code,

\[ R \lesssim 1 - H_2\!\left(\frac{d}{2n}\right), \quad \text{as } n \to \infty. \]


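The next is the Elias bound, which depends on the following simple result
(\(A(n, d, w)\) was defined in §1):


Theorem 33.

\[ A(n, 2\delta) \le \frac{2^n A(n, 2\delta, w)}{\binom{n}{w}} \quad \text{for all } 0 \le w \le n. \]


Proof. Let \(\mathscr{C}\) be an \((n, M, 2\delta)\) code with the largest possible number of
codewords, i.e. \(M = A(n, 2\delta)\). For \(u, v \in F^n\) define

\[ \chi(u, v) = \begin{cases} 1 & \text{if } u \in \mathscr{C} \text{ and } \operatorname{dist}(u, v) = w, \\ 0 & \text{otherwise.} \end{cases} \]

We evaluate in two ways the following sum.

\[
\begin{aligned}
\sum_{u \in F^n} \sum_{v \in F^n} \chi(u, v)
  &= \sum_{u \in \mathscr{C}} \binom{n}{w} = \binom{n}{w} A(n, 2\delta) \\
  &= \sum_{v \in F^n} \sum_{u \in F^n} \chi(u, v) \\
  &\le \sum_{v \in F^n} A(n, 2\delta, w) = 2^n A(n, 2\delta, w).
\end{aligned}
\]

Q.E.D.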
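Theorem 34. (The Elias upper bound [1192].) For any \((n, M, d)\) code,

\[ R \lesssim 1 - H_2\!\left( \frac12 - \frac12 \sqrt{1 - \frac{2d}{n}} \right), \quad \text{as } n \to \infty. \tag{60} \]


Proof. From Theorem 2, \(A(n, 2\delta, w) \le \delta n/(w^2 - wn + \delta n)\). Set

\[ w_0 = \left[ \frac{n}{2} - \frac{n}{2} \sqrt{1 - \frac{4(\delta - 1)}{n}} \right]. \]

Then \(A(n, 2\delta, w_0) \le \delta\), and from Theorem 33, \(A(n - 1, 2\delta - 1) = A(n, 2\delta) \le
2^n \delta / \binom{n}{w_0}\). Now (60) follows from Ch. 10, Lemma 7. Q.E.D.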


Finally we come to the strongest upper bound presently known. This is in 
two parts; we begin with the simpler. 


*Theorem 35. (The McEliece-Rodemich-Rumsey-Welch upper bound I [947].)
For any \((n, M, d)\) code,

\[ R \lesssim H_2\!\left( \frac12 - \sqrt{\frac{d}{n}\Bigl(1 - \frac{d}{n}\Bigr)} \right). \tag{61} \]

Proof. This will follow from the second linear programming bound (Corollary
21), with the proper choice of \(\beta(x)\). Set

\[ \alpha(x) = \frac{1}{a - x} \bigl\{ P_{t+1}(x) P_t(a) - P_t(x) P_{t+1}(a) \bigr\}^2, \tag{62} \]

where \(a\) and \(t\) will be chosen presently. From the Christoffel-Darboux
formula (Ch. 5, Problem 47),

\[ P_{t+1}(x) P_t(a) - P_t(x) P_{t+1}(a) = \frac{2(a - x)}{t + 1} \binom{n}{t} \sum_{i=0}^{t} \frac{P_i(x) P_i(a)}{\binom{n}{i}}, \]

thus

\[ \alpha(x) = \frac{2}{t + 1} \binom{n}{t} \bigl\{ P_{t+1}(x) P_t(a) - P_t(x) P_{t+1}(a) \bigr\} \sum_{i=0}^{t} \frac{P_i(x) P_i(a)}{\binom{n}{i}}. \tag{63} \]

Since \(\alpha(x)\) is a polynomial in \(x\) of degree \(2t + 1\), it may be expanded as

\[ \alpha(x) = \sum_{k=0}^{2t+1} \alpha_k P_k(x). \]

We will then choose \(\beta(x) = \alpha(x)/\alpha_0\).

In order to apply Corollary 21 it must be shown that

(i) \(\alpha(i) \le 0\) for \(i = d, \ldots, n\),
(ii) \(\alpha_i \ge 0\) for \(i = 1, \ldots, n\),
(iii) \(\alpha_0 > 0\) (so that \(\beta_0 = 1\)).

From (62), \(\alpha(x) \le 0\) if \(x > a\), so (i) holds if \(a < d\). Since the Krawtchouk
polynomials form an orthogonal family (Ch. 5, Theorem 16) it follows from
Theorem 3.3.1 of Szegő [1297] that

(a) \(P_t(x)\) has \(t\) distinct real zeros in the open interval \((0, n)\); and

(b) if \(x_1^{(t)} < \cdots < x_t^{(t)}\) are the zeros of \(P_t(x)\), and we set \(x_0^{(t)} = 0\), \(x_{t+1}^{(t)} = n\), then
\(P_{t+1}(x)\) has exactly one zero in each open interval \((x_i^{(t)}, x_{i+1}^{(t)})\), \(i = 0, \ldots, t\) (see
Fig. 17.5). To make (ii) and (iii) hold, we choose \(a\) in the range \(x_1^{(t+1)} < a < x_1^{(t)}\).





Fig. 17.5. The zeros of Krawtchouk polynomials are interlaced. 
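Two facts about Krawtchouk polynomials carry this part of the argument: the orthogonality relation \(\sum_{i=0}^{n} \binom{n}{i} P_r(i) P_s(i) = 2^n \binom{n}{r} \delta_{rs}\), and the difference equation \((n - i) P_t(i+1) = (n - 2t) P_t(i) - i P_t(i-1)\) that is used in the proof of Lemma 36. The sketch below checks both in exact integer arithmetic for a small \(n\). It assumes the usual explicit formula \(P_k(x) = \sum_j (-1)^j \binom{x}{j} \binom{n-x}{k-j}\); the function names are ours.

```python
from math import comb


def krawtchouk(k: int, x: int, n: int) -> int:
    """Binary Krawtchouk polynomial P_k(x; n) at an integer point 0 <= x <= n."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))


def orthogonality_sum(r: int, s: int, n: int) -> int:
    """sum_i C(n, i) P_r(i) P_s(i); equals 2^n C(n, r) if r == s, else 0."""
    return sum(comb(n, i) * krawtchouk(r, i, n) * krawtchouk(s, i, n)
               for i in range(n + 1))


def recurrence_defect(t: int, i: int, n: int) -> int:
    """(n - i) P_t(i+1) - (n - 2t) P_t(i) + i P_t(i-1); zero for 1 <= i <= n-1."""
    return ((n - i) * krawtchouk(t, i + 1, n)
            - (n - 2 * t) * krawtchouk(t, i, n)
            + i * krawtchouk(t, i - 1, n))


if __name__ == "__main__":
    n = 10
    ok_orth = all(
        orthogonality_sum(r, s, n) == (2**n * comb(n, r) if r == s else 0)
        for r in range(n + 1) for s in range(n + 1)
    )
    ok_rec = all(recurrence_defect(t, i, n) == 0
                 for t in range(n + 1) for i in range(1, n))
    print(ok_orth, ok_rec)
```

Working at integer points only keeps everything exact; locating the real zeros \(x_i^{(t)}\) themselves would need floating-point root finding, which is deliberately left out of this sketch.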
Then for \(0 \le i \le t\),

\[ P_i(a) P_t(a) > 0, \qquad P_{t+1}(a) P_i(a) < 0. \tag{64} \]

Now from Corollary 4 of Ch. 21, it follows that, for \(k + l \le n\),

\[ P_k(x) P_l(x) = \sum_{m} c_{klm} P_m(x), \tag{65} \]

where the coefficients \(c_{klm}\) are nonnegative integers. From (63)-(65) the
coefficients \(\alpha_i\) are nonnegative, proving (ii).
From Ch. 5, Theorem 20,

\[ \alpha_0 = \frac{1}{2^n} \sum_{i=0}^{n} \binom{n}{i} \alpha(i), \]

which simplifies to (using (63))

\[ \alpha_0 = -\frac{2}{t + 1} \binom{n}{t} P_{t+1}(a) P_t(a) > 0, \]

and proves (iii). So provided

\[ a < d \quad \text{and} \quad x_1^{(t+1)} < a < x_1^{(t)}, \]

Corollary 21 applies, giving

\[ A(n, d) \le \beta(0) = \frac{\alpha(0)}{\alpha_0} = \frac{\binom{n}{t} \{ n - t - (t + 1) Q \}^2}{-2a(t + 1) Q}, \tag{66} \]

where

\[ Q = \frac{P_{t+1}(a)}{P_t(a)}. \]

Next we need a result about the asymptotic behaviour of \(x_1^{(t)}\) as \(n \to \infty\) and
the ratio \(t/n\) is held (approximately) fixed.


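Lemma 36. If \(t = [\lambda n]\), where \(0 < \lambda < \tfrac12\), then

\[ x_1^{(t)} \lesssim n \bigl( \tfrac12 - \sqrt{\lambda(1 - \lambda)} \bigr), \quad \text{as } n \to \infty. \tag{67} \]

Proof of Lemma. Suppose (67) is false. Then there is a fixed \(\epsilon\) with \(0 < \epsilon <
\sqrt{\lambda(1 - \lambda)}\) and an infinite sequence of \(n\)'s such that \(x_1^{(t)} \ge n(r + 2\epsilon)\), where
\(r = \tfrac12 - \sqrt{\lambda(1 - \lambda)}\). For each \(n\) in the sequence set \(t = t(n) = [\lambda n]\). Then

\[ P_t(x) = \frac{(-2)^t}{t!} \prod_{k=1}^{t} \bigl( x - x_k^{(t)} \bigr), \]

\[ \log_e \frac{P_t(i + 1)}{P_t(i)} = \sum_{k=1}^{t} \log_e \Bigl( 1 + \frac{1}{i - x_k^{(t)}} \Bigr). \]

Set \(i = i(n) = [n(r + \epsilon)]\), so that \(|i - x_k^{(t)}| \ge |i - x_1^{(t)}| \ge \epsilon n\), and

\[ \log_e \Bigl( 1 + \frac{1}{i - x_k^{(t)}} \Bigr) = \frac{1}{i - x_k^{(t)}} + O\Bigl( \frac{1}{n^2} \Bigr), \]

\[ \log_e \frac{P_t(i + 1)}{P_t(i)} = \sum_{k=1}^{t} \frac{1}{i - x_k^{(t)}} + O\Bigl( \frac{1}{n} \Bigr). \tag{68} \]

Similarly

\[ \log_e \frac{P_t(i)}{P_t(i - 1)} = \sum_{k=1}^{t} \frac{1}{i - x_k^{(t)}} + O\Bigl( \frac{1}{n} \Bigr). \]

Hence

\[ \log_e \frac{P_t(i + 1)}{P_t(i)} = \log_e \frac{P_t(i)}{P_t(i - 1)} + O\Bigl( \frac{1}{n} \Bigr), \qquad
   \frac{P_t(i + 1)}{P_t(i)} = \frac{P_t(i)}{P_t(i - 1)} \Bigl( 1 + O\Bigl( \frac{1}{n} \Bigr) \Bigr). \tag{69} \]

Now from Problem 46 of Ch. 5,

\[ (n - i) P_t(i + 1) = (n - 2t) P_t(i) - i P_t(i - 1), \]

so

\[ (n - i) \frac{P_t(i + 1)}{P_t(i)} - (n - 2t) + i \frac{P_t(i - 1)}{P_t(i)} = 0. \tag{70} \]

If we set \(\rho = P_t(i)/P_t(i - 1)\) then from (69), (70)

\[ (n - i) \rho^2 - (n - 2t) \rho + i + \rho^2 O(1) = 0. \]

From (68), \(\rho = \exp\bigl( \sum_{k} 1/(i - x_k^{(t)}) \bigr) (1 + O(1/n))\), which is bounded, so

\[ (n - i) \rho^2 - (n - 2t) \rho + i + O(1) = 0. \]

Since \(\rho\) is real, we must have

\[ (n - 2t)^2 - 4i(n - i) + O(n) \ge 0, \]

\[ (1 - 2\lambda)^2 - 4(r + \epsilon)(1 - r - \epsilon) + O\Bigl( \frac{1}{n} \Bigr) \ge 0. \tag{71} \]

But \((1 - 2\lambda)^2 = 4r(1 - r)\), so

\[ \epsilon^2 - \epsilon(1 - 2r) + O\Bigl( \frac{1}{n} \Bigr) \ge 0. \]

But this is false for large \(n\), since

\[ \epsilon < \sqrt{\lambda(1 - \lambda)} = (1 - 2r)/2. \]

This completes the proof of the Lemma.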

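It is possible to show by contour integration that \((1/n) x_1^{(t)} \to \tfrac12 - \sqrt{\lambda(1 - \lambda)}\)
as \(n \to \infty\), but to prove Theorem 35 a crude lower bound is sufficient. Suppose
\(t < n/2\). Since \(P_t(0) = \binom{n}{t} > 0\) and \(P_t(1) = (1 - (2t/n)) \binom{n}{t} > 0\), it follows from
Theorem 3.41.2 of Szegő that \(P_t(x) > 0\) for \(0 \le x \le 1\). Hence

\[ x_1^{(t)} \ge 1. \tag{72} \]

We can now choose \(a\) and \(t\) in the proof of the theorem. From Fig. 17.5
there is a value of \(a\) in the range \(x_1^{(t+1)} < a < x_1^{(t)}\), say \(a = a_0\), for which
\(Q = P_{t+1}(a_0)/P_t(a_0) = -1\). Set \(a = a_0\). From the lemma, if we choose \(t = [\lambda n]\)
where \(\lambda\) is such that

\[ \frac12 - \sqrt{\lambda(1 - \lambda)} < \frac{d}{n}, \]

i.e. if

\[ \lambda > \frac12 - \sqrt{\frac{d}{n}\Bigl(1 - \frac{d}{n}\Bigr)}, \tag{73} \]

then \(x_1^{(t)} < d\) for \(n\) large. Using these values we get, from (66),

\[ A(n, d) \le \frac{(n + 1)^2 \binom{n}{t}}{2a(t + 1)}, \]

so

\[ R = \frac{1}{n} \log_2 A(n, d) \lesssim H_2\!\left(\frac{t}{n}\right) \quad \text{using (72) and Ch. 10, Lemma 7,} \]

\[ \lesssim H_2\!\left( \frac12 - \sqrt{\frac{d}{n}\Bigl(1 - \frac{d}{n}\Bigr)} \right) \quad \text{from (73). Q.E.D.} \]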


The same kind of argument can be applied to the linear programming
bound for \(A(n, d, w)\) (see p. 545). Combining this with Theorem 33 gives:


Theorem 37. (The McEliece-Rodemich-Rumsey-Welch upper bound II [947].)
For any \((n, M, d)\) code,

\[ R \lesssim B\!\left(\frac{d}{n}\right), \quad \text{as } n \to \infty, \tag{74} \]

where

\[ B(\delta) = \min_{0 \le u \le 1 - 2\delta} B(u, \delta), \]

and

\[ B(u, \delta) = 1 + h(u^2) - h(u^2 + 2\delta u + 2\delta), \qquad
   h(x) = H_2\bigl( \tfrac12 - \tfrac12 \sqrt{1 - x} \bigr). \]


The proof is omitted.
Notice that \(B(1 - 2\delta, \delta) = H_2\bigl( \tfrac12 - \sqrt{\delta(1 - \delta)} \bigr)\), so

\[ B\!\left(\frac{d}{n}\right) \le H_2\!\left( \frac12 - \sqrt{\frac{d}{n}\Bigl(1 - \frac{d}{n}\Bigr)} \right), \]

and Theorem 37 is never weaker than Theorem 35. In fact it turns out that for
\(d/n \ge 0.273\), \(B(d/n)\) is actually equal to

\[ H_2\!\left( \frac12 - \sqrt{\frac{d}{n}\Bigl(1 - \frac{d}{n}\Bigr)} \right), \]

and in this range Theorems 35 and 37 coincide. For \(d/n < 0.273\), Theorem 37 is
slightly stronger, as shown by the table in Fig. 17.6. The best upper and lower
bounds are plotted in Fig. 17.7.


         Gilbert-       Sphere-
         Varshamov      packing        Elias          McEliece-Rodemich-
         lower          upper          upper          Rumsey-Welch
         bound,         bound,         bound,         upper bounds,
 d/n     Theorem 30     Theorem 32     Theorem 34     Theorem 35    Theorem 37
 0       1              1              1              1             1
 0.1     0.531          0.714          0.702          0.722         0.693
 0.2     0.278          0.531          0.492          0.469         0.461
 0.3     0.119          0.390          0.312          0.250         0.250
 0.4     0.029          0.278          0.150          0.081         0.081
 0.5     0              0.189          0              0             0


Fig. 17.6. Bounds on the rate R of the best binary code as a function of d/n, for n large.
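The entries of Fig. 17.6 can be reproduced numerically from the formulas in Theorems 30, 32, 34, 35 and 37. The following is a minimal sketch: the function names are ours, and the minimum over \(u\) in Theorem 37 is approximated by a plain grid search (an assumption of this sketch, not a method from the text).

```python
from math import log2, sqrt


def H2(x: float) -> float:
    """Binary entropy function, with H2(0) = H2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


def gilbert_varshamov(delta: float) -> float:      # Theorem 30
    return 1 - H2(delta)


def sphere_packing(delta: float) -> float:         # Theorem 32
    return 1 - H2(delta / 2)


def elias(delta: float) -> float:                  # Theorem 34
    return 1 - H2(0.5 - 0.5 * sqrt(max(0.0, 1 - 2 * delta)))


def mrrw_first(delta: float) -> float:             # Theorem 35
    return H2(0.5 - sqrt(delta * (1 - delta)))


def h(x: float) -> float:
    # max() guards against tiny negative arguments caused by rounding.
    return H2(0.5 - 0.5 * sqrt(max(0.0, 1.0 - x)))


def mrrw_second(delta: float, steps: int = 2000) -> float:   # Theorem 37
    """Grid-search approximation of min over 0 <= u <= 1 - 2*delta of B(u, delta)."""
    u_max = 1 - 2 * delta
    values = []
    for s in range(steps + 1):
        u = u_max * s / steps
        values.append(1 + h(u * u) - h(u * u + 2 * delta * u + 2 * delta))
    return min(values)


if __name__ == "__main__":
    for delta in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
        row = (gilbert_varshamov(delta), sphere_packing(delta), elias(delta),
               mrrw_first(delta), mrrw_second(delta))
        print(f"{delta:.1f}  " + "  ".join(f"{v:.3f}" for v in row))
```

Because the grid includes the endpoint \(u = 1 - 2\delta\), the computed value for Theorem 37 can never exceed the value for Theorem 35, in line with the remark above.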


Research Problem (17.9). What is the true upper bound on \((1/n) \log_2 A(n, d)\) as a
function of \(d/n\), as \(n \to \infty\)?

[Figure: rate plotted against d/n from 0 to 0.5, showing the McEliece-Rodemich-Rumsey-Welch upper bound and the Gilbert-Varshamov lower bound.]

Fig. 17.7. Asymptotic bounds on the best binary codes.


(At present it is not even known if the limit

\[ \lim_{n \to \infty} \frac{1}{n} \log_2 A(n, d) \]

exists, when \(d/n\) is fixed and between 0 and \(\tfrac12\).)

It is possible to obtain an asymptotic bound from Corollary 21 which applies
just outside the region where the Plotkin bound is tight.


Theorem 38. (McEliece [944].) For any positive \(j\) satisfying \(j = o(\sqrt{d})\), we
have

\[ A(2d + j, d) \lesssim 2d(j + 2), \quad \text{as } d \to \infty. \tag{75} \]


Proof. This time we choose \(\alpha(x)\) to be a cubic, with roots at \(d\), \(l\) and \(l + 1\),


where 
z i 12) 
i~ (a+ A (E) 
Then

\[ \alpha(x) = (d - x)(l - x)(l + 1 - x) = \alpha_0 + \alpha_1 P_1(x) + \alpha_2 P_2(x) + \alpha_3 P_3(x), \]
where 
a 4*2) " d? 
omat TGD” 
saaa -3 
ayp Cp 


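We take \(\beta(x) = \alpha(x)/\alpha_0\). Since the \(\alpha_i\) are positive, and \(\beta(i) \le 0\) for
\(i = d, \ldots, 2d + j\), Corollary 21 applies, and so

\[ A(2d + j, d) \le \alpha(0)/\alpha_0 \sim 2d(j + 2) \quad \text{as } d \to \infty. \quad \text{Q.E.D.} \]


For \(j = 1\) (75) gives \(A(2d + 1, d) \lesssim 6d\), weaker than Theorem 11(b). But for
\(j = 2, 3, \ldots\) (75) is the best asymptotic result known.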


Notes on Chapter 17 


§1. \(B(n, d)\) is related to the critical problem of combinatorial geometry - see
Crapo and Rota [315], Dowling [385], Kelly and Rota [754] and White [1411].


§3. The purported proof that \(A(9, 3) \le 39\), \(A(10, 3) \le 78\) and \(A(11, 3) \le 154\)
given by Wax [1391] (based on density arguments of Blichfeldt [163] and
Rankin [1089]) is incorrect - see Best et al. [140].


§4. McEliece et al. (unpublished) and Best and Brouwer [139] have
independently used the linear programming method to show that

\[ A(2^m - 4, 3) = 2^{2^m - m - 4}. \]


Methods for combining codes 


§1. Introduction 


This chapter describes methods for combining codes to get new ones. One
of the simplest ways to combine two codes is to form their direct product (see
Fig. 18.1), and the first part of this chapter studies product codes and their
generalizations. We use the informal term "product code" whenever the
codewords are rectangles. After defining the direct product (in §2) we give a
necessary and sufficient condition for a cyclic code to be the direct product of
cyclic codes (§3). Since this is not always possible, §§4, 5 and 6 give some
other ways of factoring cyclic codes.

The construction in §4 takes the direct product of codes over a larger field
and then obtains a binary code by the trace mapping. We saw in §11 of Ch. 10
that concatenated codes can be very good. Of course a concatenated code is
also a kind of product code. §5 studies concatenated codes in the special case
when the inner code is an irreducible cyclic code - this is called the *
construction. Finally §6 gives yet another method of factoring, due to Kasami,
which applies to any cyclic code. This works by expressing the
Mattson-Solomon (MS) polynomials of the codewords in terms of the MS polynomials
of shorter codes.

The second part of the chapter gives a number of other powerful and
ingenious ways to combine codes; a summary of these techniques will be
found at the beginning of that Part (on page 581).
PART I: Product codes and generalizations


§2. Direct product codes


Let \(\mathscr{A}\) and \(\mathscr{B}\) be respectively \([n_1, k_1, d_1]\) and \([n_2, k_2, d_2]\) linear codes over
GF(q). Suppose for simplicity that the information symbols are the first \(k_1\)
symbols of \(\mathscr{A}\) and the first \(k_2\) symbols of \(\mathscr{B}\).


Definition. The direct product \(\mathscr{A} \otimes \mathscr{B}\) is an \([n_1 n_2, k_1 k_2, d_1 d_2]\) code whose
codewords consist of all \(n_1 \times n_2\) arrays constructed as follows (see Fig. 18.1).


[Figure: a rectangular array divided into blocks labelled "information symbols", "checks on columns" and "checks on checks".]

Fig. 18.1. Codewords of the direct product code \(\mathscr{A} \otimes \mathscr{B}\).


The top left corner contains the \(k_1 k_2\) information symbols. The first \(k_2\)
columns are chosen so as to belong to \(\mathscr{A}\), and then the rows are completed so
as to belong to \(\mathscr{B}\). This is also called the Kronecker product of \(\mathscr{A}\) and \(\mathscr{B}\), and
is the simplest kind of product code. The columns are codewords of \(\mathscr{A}\), and
the rows are codewords of \(\mathscr{B}\).


Example. (1) The direct product of the [3, 2, 2] binary code with itself is the
[9, 4, 4] code consisting of the 16 arrays shown in Fig. 18.2.


Problems. (1) Verify that \(\mathscr{A} \otimes \mathscr{B}\) is a linear code over GF(q) with parameters
\([n_1 n_2, k_1 k_2, d_1 d_2]\).

(2) The codewords could be also formed by first completing the top \(k_1\)
rows, then all the columns. Show that this gives the same code.


More generally, let \(\mathscr{A}\) and \(\mathscr{B}\) be arbitrary \([n_1, k_1, d_1]\) and \([n_2, k_2, d_2]\) linear
codes, without assuming that the initial symbols are the information symbols.
[3, 2, 2] ⊗ [3, 2, 2] = [9, 4, 4] code

    000   000   000   000
    000   011   101   110
    000   011   101   110

    011   011   011   011
    000   011   101   110
    011   000   110   101

    101   101   101   101
    000   011   101   110
    101   110   000   011

    110   110   110   110
    000   011   101   110
    110   101   011   000

Fig. 18.2.


Then the direct product \(\mathscr{A} \otimes \mathscr{B}\) is defined to be the \([n_1 n_2, k_1 k_2, d_1 d_2]\) code
whose codewords consist of all \(n_1 \times n_2\) arrays in which the columns belong to
\(\mathscr{A}\) and the rows to \(\mathscr{B}\).

Unfortunately direct product codes usually have poor minimum distance
(but are easy to decode). Sometimes the rectangular form of the codewords
makes them useful in certain applications - see Notes.


Problem. (3) Let \(G_1\) and \(G_2\) be generator matrices for \(\mathscr{A}\) and \(\mathscr{B}\) respectively.
Show that the Kronecker product \(G_1 \otimes G_2\) (defined in §4 of Ch. 14) is a
generator matrix for \(\mathscr{A} \otimes \mathscr{B}\).
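The statement of Problem 3 can be tried out on Example 1. The sketch below forms the Kronecker product of a generator matrix of the [3, 2, 2] code with itself, lists the 16 codewords of the [9, 4, 4] code, and checks that every row and column of each 3 x 3 array lies in the [3, 2, 2] code. The helper names and the particular generator matrix are our choices for illustration.

```python
from itertools import product


def kron_gf2(g1, g2):
    """Kronecker product of two binary matrices given as lists of rows."""
    return [[a * b for a in r1 for b in r2] for r1 in g1 for r2 in g2]


def span_gf2(gen):
    """All GF(2)-linear combinations of the rows of gen, as a set of tuples."""
    n = len(gen[0])
    words = set()
    for coeffs in product([0, 1], repeat=len(gen)):
        words.add(tuple(sum(c * row[j] for c, row in zip(coeffs, gen)) % 2
                        for j in range(n)))
    return words


G = [[1, 0, 1],
     [0, 1, 1]]                      # a generator matrix of the [3, 2, 2] code
base_code = span_gf2(G)
product_code = span_gf2(kron_gf2(G, G))
min_weight = min(sum(w) for w in product_code if any(w))

if __name__ == "__main__":
    print(len(product_code), min_weight)      # 16 4
    for w in sorted(product_code):
        print(w[0:3], w[3:6], w[6:9])
```

Reading each length-9 word as three consecutive rows of length 3 recovers the arrays of Fig. 18.2.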


Suppose \(\mathscr{A}\) and \(\mathscr{B}\) are cyclic codes, as in Example 1. Then the direct
product \(\mathscr{A} \otimes \mathscr{B}\) is invariant under cyclic permutation of all the rows
simultaneously, or all the columns simultaneously. We shall represent a typical
codeword

\[ \begin{bmatrix}
a_{00} & a_{01} & \cdots & a_{0,n_2-1} \\
\vdots & \vdots & & \vdots \\
a_{n_1-1,0} & a_{n_1-1,1} & \cdots & a_{n_1-1,n_2-1}
\end{bmatrix} \tag{1} \]

of \(\mathscr{A} \otimes \mathscr{B}\) by the polynomial

\[ f(x, y) = \sum_{i=0}^{n_1-1} \sum_{j=0}^{n_2-1} a_{ij} x^i y^j. \tag{2} \]

If we assume \(x^{n_1} = 1\) and \(y^{n_2} = 1\), then \(x f(x, y)\) and \(y f(x, y)\) represent cyclic
shifts of the rows and of the columns, and also belong to \(\mathscr{A} \otimes \mathscr{B}\). In other
words, \(\mathscr{A} \otimes \mathscr{B}\) is an ideal in the group algebra \(\mathscr{G}\) of the abelian group
generated by \(x\) and \(y\) with \(x^{n_1} = y^{n_2} = 1\).

Suppose \(n_1\) and \(n_2\) are relatively prime. Then by the Chinese remainder
theorem (§9 of Ch. 10) for each pair \(i_1, i_2\) with \(0 \le i_1 < n_1\), \(0 \le i_2 < n_2\) there is a
unique integer \(I(i_1, i_2)\) in the range \(0 \le I(i_1, i_2) < n_1 n_2\) such that

\[ I(i_1, i_2) \equiv i_1 \pmod{n_1}, \qquad I(i_1, i_2) \equiv i_2 \pmod{n_2}. \tag{3} \]

This implies that we can rewrite \(f(x, y)\) in terms of the single variable \(z = xy\),
by replacing each term \(x^{i_1} y^{i_2}\) by \(z^{I(i_1, i_2)}\). (In this case \(\mathscr{G}\) is the group algebra of
the cyclic group of order \(n_1 n_2\) generated by \(z\).)
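For example if \(n_1 = 3\), \(n_2 = 5\) the values of \(I(i_1, i_2)\) are:

             i_2 = 0    1    2    3    4
    i_1 = 0        0    6   12    3    9
    i_1 = 1       10    1    7   13    4
    i_1 = 2        5   11    2    8   14

The codeword \(f(x, y)\) becomes

\[
\begin{aligned}
f(x, y) = {} & a_{00} + a_{01} z^{6} + a_{02} z^{12} + a_{03} z^{3} + a_{04} z^{9} \\
 & + a_{10} z^{10} + a_{11} z + a_{12} z^{7} + a_{13} z^{13} + a_{14} z^{4} \\
 & + a_{20} z^{5} + a_{21} z^{11} + a_{22} z^{2} + a_{23} z^{8} + a_{24} z^{14}.
\end{aligned} \tag{4}
\]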
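The index map of (3) can be tabulated by brute force. The following is a small sketch (the function name is ours) that searches for the unique \(I(i_1, i_2)\) and prints the table for \(n_1 = 3\), \(n_2 = 5\), i.e. the exponents of \(z\) appearing in (4).

```python
def crt_index(i1: int, i2: int, n1: int, n2: int) -> int:
    """The integer I in [0, n1*n2) with I = i1 (mod n1) and I = i2 (mod n2)."""
    for candidate in range(n1 * n2):
        if candidate % n1 == i1 and candidate % n2 == i2:
            return candidate
    raise ValueError("no solution: n1 and n2 must be relatively prime")


def crt_table(n1: int, n2: int):
    """Rows indexed by i1, columns by i2."""
    return [[crt_index(i1, i2, n1, n2) for i2 in range(n2)] for i1 in range(n1)]


if __name__ == "__main__":
    for row in crt_table(3, 5):
        print(row)
    # [0, 6, 12, 3, 9]
    # [10, 1, 7, 13, 4]
    # [5, 11, 2, 8, 14]
```

The inverse map is simply \(I \mapsto (I \bmod n_1, I \bmod n_2)\), which is how a power of \(z\) is placed back into the array.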


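Theorem 1. If \(\mathscr{A}\) and \(\mathscr{B}\) are cyclic codes and \((n_1, n_2) = 1\) then \(\mathscr{C} = \mathscr{A} \otimes \mathscr{B}\) is
also cyclic.


Proof. As shown above, a codeword \(f(x, y) \in \mathscr{C}\) can be written as \(g(z)\) where
\(z = xy\). Also if \(f(x, y) \in \mathscr{C}\), \(y f(x, y) \in \mathscr{C}\), hence \(xy f(x, y) = z g(z) \in \mathscr{C}\). Therefore
\(\mathscr{C}\) is an ideal in the group algebra of the cyclic group generated by \(z\). Q.E.D.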


Example. (2) Let \(\mathscr{A}\) and \(\mathscr{B}\) be the [3, 2, 2] and [5, 4, 2] even weight codes.
Since \((3, 5) = 1\), \(\mathscr{C} = \mathscr{A} \otimes \mathscr{B}\) is a cyclic [15, 8, 4] code. Some typical
codewords of \(\mathscr{C}\) are

        01010          00000          11000
    u = 10010 ,    v = 01111 ,    w = 01100 .
        11000          01111          10100

The first four columns are arbitrary codewords of \(\mathscr{A}\) and the last column is
their sum. Equation (4) gives the cyclic representation of \(\mathscr{C}\). For example, the
idempotent of \(\mathscr{C}\) is (cf. Problem 7 of Ch. 8)

\[ \theta_1 + \theta_7 = z + z^2 + z^4 + z^7 + z^8 + z^{11} + z^{13} + z^{14}, \]

and the generator polynomial is \(1 + z + z^2 + z^5 + z^6 + z^7\). These are the
codewords \(v\) and \(w\) respectively.
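The correspondence between polynomials in \(z\) and 3 x 5 arrays in Example 2 can be checked mechanically. In the sketch below the exponent sets are our reading of the idempotent \(z + z^2 + z^4 + z^7 + z^8 + z^{11} + z^{13} + z^{14}\) and the generator polynomial \(1 + z + z^2 + z^5 + z^6 + z^7\); the function names are ours. A power \(z^e\) is placed in row \(e \bmod 3\), column \(e \bmod 5\), which inverts the map (3).

```python
N1, N2 = 3, 5
N = N1 * N2

IDEMPOTENT = {1, 2, 4, 7, 8, 11, 13, 14}    # exponents of z in the idempotent
GENERATOR = {0, 1, 2, 5, 6, 7}              # exponents of z in the generator polynomial


def to_array(exponents):
    """Place each z^e at row e mod N1, column e mod N2 of an N1 x N2 array."""
    array = [[0] * N2 for _ in range(N1)]
    for e in exponents:
        array[e % N1][e % N2] = 1
    return array


def square_mod(exponents, n):
    """Square a binary polynomial modulo z^n - 1: over GF(2) every exponent doubles."""
    result = set()
    for e in exponents:
        result ^= {(2 * e) % n}             # symmetric difference = addition mod 2
    return result


if __name__ == "__main__":
    for row in to_array(IDEMPOTENT):
        print("".join(map(str, row)))       # 00000 / 01111 / 01111, the array v
    for row in to_array(GENERATOR):
        print("".join(map(str, row)))       # 11000 / 01100 / 10100, the array w
    print(square_mod(IDEMPOTENT, N) == IDEMPOTENT)   # True
```

The last line confirms that the first polynomial is indeed idempotent modulo \(z^{15} - 1\).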


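Problems. (4) Suppose \((n_1, n_2) = 1\) and \(E_1(x)\), \(E_2(y)\) are the idempotents of \(\mathscr{A}\)
and \(\mathscr{B}\). Show that the idempotent of \(\mathscr{A} \otimes \mathscr{B}\) is obtained from \(E_1(x) E_2(y)\) by
replacing \(xy\) by \(z\).

(5) Suppose \((n_1, n_2) = 1\). If \(\beta_1, \ldots, \beta_{k_1}\) and \(\gamma_1, \ldots, \gamma_{k_2}\) are respectively the
nonzeros of \(\mathscr{A}\) and \(\mathscr{B}\), show that the \(k_1 k_2\) nonzeros of \(\mathscr{A} \otimes \mathscr{B}\) are \(\beta_i \gamma_j\) for
\(1 \le i \le k_1\), \(1 \le j \le k_2\).

(6) Take \(\mathscr{A}\) and \(\mathscr{B}\) to be respectively [3, 2, 2] and [7, 3, 4] codes. Find the
parameters, weight distribution, idempotent, and generator polynomial of
\(\mathscr{A} \otimes \mathscr{B}\).

(7) Other choices for \(z\) besides \(z = xy\) may make \(\mathscr{A} \otimes \mathscr{B}\) into a cyclic code.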
When do different choices give a different cyclic code? [Hint: for the code of 
Problem 6 show that z = Xy and z = xy’ give different results.] 

Sometimes knowing that a code is a direct product enables one to find its
weight distribution.

(8) (Hard.) Let \(\mathscr{A}\) and \(\mathscr{B}\) be \([2^{m_1} - 1, m_1, 2^{m_1 - 1}]\) and \([2^{m_2} - 1, m_2, 2^{m_2 - 1}]\) simplex
codes, where \((m_1, m_2) = 1\) and \(m_1 > m_2\). Show that \(\mathscr{A} \otimes \mathscr{B}\) has \(m_2\) distinct
nonzero weights, namely \(w_i = (2^{m_2} - 2^{m_2 - i}) 2^{m_1 - 1}\) for \(i = 1, \ldots, m_2\). Show that the
number of codewords of weight \(w_i\) is

\[ F(m_2, i)\,(2^{m_1} - 1)(2^{m_1} - 2) \cdots (2^{m_1} - 2^{i-1}), \]

where \(F(r, k)\), \(r \ge k\), is given by the recurrence

\[ F(r, 0) = 1, \qquad F(r, r) = 1, \qquad F(r + 1, k) = F(r, k - 1) + 2^k F(r, k). \]
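The formula in Problem 8 can be compared with a brute-force count in the smallest nontrivial case \(m_1 = 3\), \(m_2 = 2\), which is the [21, 6, 8] code of Problem 6. The sketch below assumes the usual description of a simplex code, whose generator matrix has all nonzero column vectors as columns, so that the codeword of the product code with information matrix \(M\) has entry \(x^T M y\) at position \((x, y)\); the function names are ours.

```python
from itertools import product


def gaussian_F(r: int, k: int) -> int:
    """F(r, k) from F(r,0) = F(r,r) = 1 and F(r+1,k) = F(r,k-1) + 2^k F(r,k)."""
    if k == 0 or k == r:
        return 1
    return gaussian_F(r - 1, k - 1) + 2**k * gaussian_F(r - 1, k)


def predicted_distribution(m1: int, m2: int) -> dict:
    """Weight distribution (nonzero weights only) claimed in Problem 8."""
    dist = {}
    for i in range(1, m2 + 1):
        weight = (2**m2 - 2**(m2 - i)) * 2**(m1 - 1)
        count = gaussian_F(m2, i)
        for j in range(i):
            count *= 2**m1 - 2**j
        dist[weight] = count
    return dist


def brute_force_distribution(m1: int, m2: int) -> dict:
    """Enumerate all information matrices M and weigh the arrays (x^T M y)."""
    xs = [v for v in product([0, 1], repeat=m1) if any(v)]
    ys = [v for v in product([0, 1], repeat=m2) if any(v)]
    dist = {}
    for bits in product([0, 1], repeat=m1 * m2):
        M = [bits[r * m2:(r + 1) * m2] for r in range(m1)]
        weight = sum(
            sum(x[r] * M[r][c] * y[c] for r in range(m1) for c in range(m2)) % 2
            for x in xs for y in ys
        )
        if weight:
            dist[weight] = dist.get(weight, 0) + 1
    return dist


if __name__ == "__main__":
    print(predicted_distribution(3, 2))      # {8: 21, 12: 42}
    print(brute_force_distribution(3, 2) == predicted_distribution(3, 2))   # True
```

The weight of a codeword depends only on the rank of \(M\), which is why the counts are products of the form appearing in the problem.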


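*§3. Not all cyclic codes are direct products of cyclic codes


For example, suppose \(\mathscr{C}\) is a nondegenerate irreducible \([n, k]\) binary cyclic
code with \(n = n_1 n_2\), \((n_1, n_2) = 1\) and \(n_1 > 1\), \(n_2 > 1\). We shall show that only in
certain cases is it possible to write \(\mathscr{C} = \mathscr{A} \otimes \mathscr{B}\) where \(\mathscr{A}\) and \(\mathscr{B}\) are cyclic
codes of lengths \(n_1\) and \(n_2\).

Let \(\alpha, \alpha^2, \alpha^{2^2}, \ldots, \alpha^{2^{k-1}}\) be the nonzeros of \(\mathscr{C}\), \(\alpha\) a primitive \(n\)th root of unity in
GF(\(2^k\)). Since \((n_1, n_2) = 1\), by §8 of Ch. 12 there exist integers \(a\), \(b\) such that
\(a n_1 + b n_2 = 1\). Define \(\beta = \alpha^{b n_2}\) and \(\gamma = \alpha^{a n_1}\), and let \(k_1\), \(k_2\) be the least integers for
which \(\beta \in\) GF(\(2^{k_1}\)), \(\gamma \in\) GF(\(2^{k_2}\)).


Theorem 2. There exist cyclic codes \(\mathscr{A}\) and \(\mathscr{B}\), of lengths \(n_1\) and \(n_2\), such that
\(\mathscr{C} = \mathscr{A} \otimes \mathscr{B}\) iff \((k_1, k_2) = 1\). (Then the dimensions of \(\mathscr{A}\) and \(\mathscr{B}\) are \(k_1\) and \(k_2\).)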
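Proof. Making the substitution \(z = xy\) puts any codeword \(g(z) \in \mathscr{C}\) into an
\(n_1 \times n_2\) array described by \(f(x, y)\). Note that

\[ z^{a n_1} = x^{a n_1} y^{a n_1} = y^{a n_1} = y^{1 - b n_2} = y \]

and similarly \(z^{b n_2} = x\). Then \(z^{a n_1} g(z) = y f(x, y) \in \mathscr{C}\), and similarly \(x f(x, y) \in \mathscr{C}\).
Therefore the columns of the arrays obtained in this way form a cyclic code
\(\mathscr{A}\) (say), and the rows form a cyclic code \(\mathscr{B}\).

Now \(\beta^{n_1} = \gamma^{n_2} = 1\), and \(\alpha = \beta\gamma\). So the nonzeros of \(\mathscr{C}\) are

\[ \beta\gamma, (\beta\gamma)^2, \ldots, (\beta\gamma)^{2^{k-1}}. \]


Lemma 3. The nonzeros of \(\mathscr{A}\) and \(\mathscr{B}\) are \(\beta, \beta^2, \ldots, \beta^{2^{k_1 - 1}}\) and \(\gamma, \gamma^2, \ldots, \gamma^{2^{k_2 - 1}}\)
respectively.

Proof. Let \(\gamma_1 \in\) GF(\(2^{k_2}\)) be an \(n_2\)th root of unity not in the set \(\{\gamma, \gamma^2, \ldots, \gamma^{2^{k_2 - 1}}\}\),
and let \(\beta_1 \in\) GF(\(2^{k_1}\)) be any \(n_1\)th root of unity. Thus \(\beta_1 \gamma_1\) is a zero of \(\mathscr{C}\). Let

\[ f(x, y) = r_0(y) + x r_1(y) + \cdots + x^{n_1 - 1} r_{n_1 - 1}(y) \]

be a nonzero codeword of \(\mathscr{C}\), where \(r_i(y)\) is the \(i\)th row. Then

\[ f(\beta_1, \gamma_1) = \sum_{i=0}^{n_1 - 1} \beta_1^{\,i}\, r_i(\gamma_1) = 0 \]

holds for all \(n_1\) choices of \(\beta_1\), giving \(n_1\) linear homogeneous equations for the
\(n_1\) quantities \(r_i(\gamma_1)\), \(0 \le i \le n_1 - 1\). Since the coefficient matrix is Vandermonde
(Lemma 17 of Ch. 4), hence nonsingular, \(r_i(\gamma_1) = 0\) for all \(i\). Therefore \(\gamma_1\) is a
zero of the row code \(\mathscr{B}\). The nonzeros of \(\mathscr{B}\) must therefore be as stated.
Similarly for \(\mathscr{A}\). Q.E.D.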


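Clearly \(k_1\) and \(k_2\) divide \(k\), and in fact if \(s\) is the g.c.d. of \(k_1\) and \(k_2\) then

\[ k = \frac{k_1 k_2}{s} = \text{l.c.m.}\,\{k_1, k_2\}. \]

The theorem now follows easily. If \((k_1, k_2) = 1\) then \(k = k_1 k_2\) and \(\mathscr{C} = \mathscr{A} \otimes \mathscr{B}\).
On the other hand if \(\mathscr{C} = \mathscr{A} \otimes \mathscr{B}\) then \(k = k_1 k_2\), so \(s = 1\). Q.E.D.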


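Examples: (3) Let \(\mathscr{C}\) be the [15, 4, 8] cyclic code with idempotent
\(z + z^2 + z^3 + z^4 + z^6 + z^8 + z^9 + z^{12}\). Let \(n_1 = 3\), \(n_2 = 5\). The nonzeros of \(\mathscr{C}\) are \(\alpha, \alpha^2, \alpha^4, \alpha^8\)
with \(\alpha^{15} = 1\). Then \(a = 2\), \(b = -1\), \(\beta = \alpha^{-5}\), \(\gamma = \alpha^{6}\), \(\beta^4 = \beta\), \(\gamma^{16} = \gamma\), and \(k_1 = 2\),
\(k_2 = 4\). Since 2 and 4 are not relatively prime, \(\mathscr{C} \ne \mathscr{A} \otimes \mathscr{B}\). Indeed, \(\mathscr{A}\) and \(\mathscr{B}\)
are respectively [3, 2, 2] and [5, 4, 2] codes, and \(\mathscr{A} \otimes \mathscr{B}\) is a [15, 8, 4] code.

(4) Let \(\mathscr{C}\) be the [21, 6, 8] code of Problem 6, with nonzeros
\(\alpha, \alpha^2, \alpha^4, \alpha^8, \alpha^{16}, \alpha^{11}\) where \(\alpha^{21} = 1\). Then \(n_1 = 3\), \(n_2 = 7\), \(-2 \cdot 3 + 7 = 1\), \(\beta = \alpha^{7}\),
\(\gamma = \alpha^{-6}\), \(k_1 = 2\), \(k_2 = 3\), and indeed \(\mathscr{C} = \mathscr{A} \otimes \mathscr{B}\).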
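The test in Theorem 2 reduces to arithmetic on multiplicative orders. The sketch below rests on an assumption we add for illustration: since \(\alpha\) is a primitive \(n\)th root of unity, \(\beta\) and \(\gamma\) are primitive \(n_1\)th and \(n_2\)th roots, so \(k_1\) and \(k_2\) are the multiplicative orders of 2 modulo \(n_1\) and \(n_2\). The function names are ours.

```python
from math import gcd


def multiplicative_order(q: int, n: int) -> int:
    """Least k >= 1 with q**k = 1 (mod n); requires n >= 2 and gcd(q, n) = 1."""
    if n < 2 or gcd(q, n) != 1:
        raise ValueError("need n >= 2 and gcd(q, n) == 1")
    k, power = 1, q % n
    while power != 1:
        power = power * q % n
        k += 1
    return k


def factors_as_direct_product(n1: int, n2: int) -> bool:
    """Theorem 2 criterion for a nondegenerate irreducible binary cyclic code
    of length n1*n2 with gcd(n1, n2) = 1: true iff gcd(k1, k2) = 1."""
    if gcd(n1, n2) != 1:
        raise ValueError("n1 and n2 must be relatively prime")
    k1 = multiplicative_order(2, n1)
    k2 = multiplicative_order(2, n2)
    return gcd(k1, k2) == 1


if __name__ == "__main__":
    print(factors_as_direct_product(3, 5))   # False: k1 = 2, k2 = 4 (Example 3)
    print(factors_as_direct_product(3, 7))   # True:  k1 = 2, k2 = 3 (Example 4)
```

In both examples the dimension \(k\) of the code is the order of 2 modulo \(n\), namely 4 for \(n = 15\) and 6 for \(n = 21\), which agrees with \(k = \text{l.c.m.}\{k_1, k_2\}\).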
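§4. Another way of factoring irreducible cyclic codes


With the same notation as in §3, if \((k_1, k_2) = s > 1\) then \(\mathscr{C} \ne \mathscr{A} \otimes \mathscr{B}\).
However, \(\mathscr{C}\) can be factored in a different way:


Theorem 4. It is always possible to find irreducible cyclic codes \(\hat{\mathscr{A}}\) and \(\hat{\mathscr{B}}\) over
GF(\(2^s\)) such that

\[ \mathscr{A} = T_s(\hat{\mathscr{A}}), \qquad \mathscr{B} = T_s(\hat{\mathscr{B}}), \tag{5} \]

and

\[ \mathscr{C} = T_s(\hat{\mathscr{A}} \otimes \hat{\mathscr{B}}). \tag{6} \]


As in Ch. 10, if \(\mathscr{D}\) is an \([n, k, d]\) code over GF(\(2^s\)), the trace of \(\mathscr{D}\), \(T_s(\mathscr{D})\), is the
\([n, k' \le sk, d' \le d]\) binary code consisting of all distinct codewords

\[ (T_s(u_1), \ldots, T_s(u_n)) \quad \text{where } (u_1, \ldots, u_n) \in \mathscr{D}. \]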


Proof. To find $4 and Ê we proceed as follows. The nonzeros of € are 
a,a’,...,a”"', corresponding to the cyclotomic coset C,= 


(91,265.55: 00690, 
C; = 2, 25 font Qs yer 


C RS 2:1 203 fe Pha y 


The idempotents over GF(2') associated with these cosets are defined by 


ó h= l, l € Cx, 
$22) lo otherwise, 
for i-0,...,s— I. Then by Theorem 6 of Ch. 8, 
n-1 
6x (z) = 2, €,z', (7) 
£ 
where 
€j = 5 a 5. (8) 
1€€;i 
The idempotent of € itself is 
n-l 
6(2) = >) ez’, (9) 
j=0 
where 
a=) a". (10) 
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Then from the above definitions 
e(z) = 5 ĝu (z) 

=F ne for all i=0,...,s—-—1.. (11) 

Thus € = T. (x), 0<i<s—1. Now @; is an [n:n k/s] code over GF(2) with 

nonzeros a,a”,...,a” *. We apply the technique of the previous section to 


€,. As before, B = a^*, y = a^^, where B € GF(q""), y € GF(q*") and q = 2". 
Then a codeword of €, can be written as an nix n array, in which each 


column belongs to an [n,, ki/s] cyclic code of; with nonzeros £, B^. zu s 
and each row to an [m, k;/s] cyclic code &,, with nonzeros y, y”,..., y^^ 
Since k,/s and k;/s are relatively prime, €, = £i) ĝı, and 

€ = T.i. O Â). Q.E.D. 


Example. (3) (cont.) Let € be the [15, 4, 8] code discussed above, with s —2. 
Set w =a’; œ is a primitive element of GF(2). Then 6.(z), 62(z) are as shown 
in Fig. 18.3 






5 7| 8 | 9 |j10]11[|12 
Bese Eo) 





8, 
sum = 
trace 





Fig. 18.3. 


Arrange the idempotent 6; in a 3x 5 array 








Thus 6,(x, y) = (19 wx + wx Awy + ey! * wy ++ o y^), which is the product 
of the idempotents of »/, and &,. These are [3, 1,3] and [5, 2, 4] codes over 
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GF(2). Similarly 62 gives the array 








and 62(x, y)7 (0t ox * o 2x’ (wy + e?! y? t e?! y! t wy’). Clearly g= 
Tf, ® B) = TAA O B2). 


Problems. (9) Show that Tx(%,(® Ê) and TA% Q &;) are both equal to the 
other [15, 4, 8] code. 

(10) Use the above method to factor the nondegenerate irreducible codes 
of length 63 = 7.9. (The necessary idempotents are given in Fig. 8.1.) 


§5. Concatenated codes: the * construction


Any concatenated code (Ch. 10) is a kind of product code. If the inner code 
is irreducible and n,,n2 are relatively prime then we shall see that the 
concatenated code is cyclic. We call this the * construction. 

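Two codes are needed. (i) An \([n_1, k_1, d_1]\) irreducible cyclic binary code \(\mathscr{A}\)
with (say) nonzeros \(\beta, \beta^2, \beta^4, \ldots, \beta^{2^{k_1 - 1}}\), where \(\beta \in\) GF(\(2^{k_1}\)) is a primitive \(n_1\)th
root of unity; and (ii) an \([n_2, k_2, d_2]\) cyclic code \(\hat{\mathscr{B}}\) over GF(\(2^{k_1}\)). Recall from §3
of Ch. 8 that \(\mathscr{A}\) is isomorphic to GF(\(2^{k_1}\)). There is an isomorphism \(\varphi\) from \(\mathscr{A}\)
to GF(\(2^{k_1}\)) given by

\[ a(x)^{\varphi} = a(\beta), \]

with inverse map \(\psi\) which sends \(\delta \in\) GF(\(2^{k_1}\)) into the binary \(n_1\)-tuple

\[ (\delta)^{\psi} = \bigl( T_{k_1}(\delta), T_{k_1}(\delta\beta^{-1}), T_{k_1}(\delta\beta^{-2}), \ldots, T_{k_1}(\delta\beta^{-(n_1 - 1)}) \bigr) \in \mathscr{A}. \tag{12} \]


The * construction. The new code, which is denoted by \(\mathscr{A} * \hat{\mathscr{B}}\), is formed by
taking each codeword \((\delta_0, \ldots, \delta_{n_2 - 1})\) of \(\hat{\mathscr{B}}\) and replacing every \(\delta_i\) by the binary
column vector obtained by transposing \((\delta_i)^{\psi}\). Thus \(\mathscr{A} * \hat{\mathscr{B}}\) is an \([n_1 n_2, k_1 k_2]\)
binary code with minimum distance \(\ge d_1 d_2\), whose codewords are \(n_1 \times n_2\)
arrays. In other words, \(\mathscr{A} * \hat{\mathscr{B}}\) is formed by concatenating the inner code \(\mathscr{A}\)
with the outer code \(\hat{\mathscr{B}}\).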


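Example. (5) Let $\mathcal{A}$ be the [3, 2, 2] code which is isomorphic to GF($2^2$) under
the mapping

$$\begin{array}{cc}
\mathcal{A} & \mathrm{GF}(2^2)\\
000 & 0\\
011 & 1\\
110 & \omega\\
101 & \omega^2
\end{array}$$

Let $\mathcal{B}$ be the [5, 2, 4] code over GF($2^2$) with idempotent

$$\omega y + \omega^2 y^2 + \omega^2 y^3 + \omega y^4.$$

Then $\mathcal{B}$ consists of the codewords

$$\begin{array}{ccccc}
0 & 0 & 0 & 0 & 0\\
0 & \omega & \omega^2 & \omega^2 & \omega\\
\omega & 0 & \omega & \omega^2 & \omega^2\\
\omega^2 & \omega & 0 & \omega & \omega^2\\
\omega^2 & \omega^2 & \omega & 0 & \omega\\
\omega & \omega^2 & \omega^2 & \omega & 0
\end{array}$$

and their multiples by $\omega$ and $\omega^2$. Then $\mathcal{A} * \mathcal{B}$ is found as follows:

$$\begin{array}{ccc}
0\ \ \omega\ \ \omega^2\ \ \omega^2\ \ \omega & \to & \begin{array}{c} 01111\\ 01001\\ 00110 \end{array}\\[6pt]
1\ \ 0\ \ 1\ \ \omega\ \ \omega & \to & \begin{array}{c} 00011\\ 10111\\ 10100 \end{array}
\end{array} \qquad (13)$$

and is our old friend the [15, 4, 8] code.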
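The * construction can be checked numerically on this example. The following is a minimal sketch, not the book's procedure: it encodes GF(4) as the integers 0, 1, 2, 3 (with 2 standing for ω), takes β = ω in the map (12), and reads each 3 × 5 array row by row. These encoding choices and all function names are assumptions made for illustration; a different primitive cube root β only relabels the columns.

```python
from itertools import product

# GF(4) = {0, 1, w, w^2} encoded as the integers 0, 1, 2, 3; addition is XOR.
EXP = [1, 2, 3]              # w^0, w^1, w^2
LOG = {1: 0, 2: 1, 3: 2}

def gf4_mul(a, b):
    """Multiply two GF(4) elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 3]

def trace(d):
    """T_2(d) = d + d^2, which always lies in GF(2)."""
    return d ^ gf4_mul(d, d)

def psi(d, beta=2):
    """Map (12) for n1 = 3: d -> (T(d), T(d*beta^-1), T(d*beta^-2))."""
    beta_inv = gf4_mul(beta, beta)   # beta^-1 = beta^2 because beta^3 = 1
    column, m = [], 1
    for _ in range(3):
        column.append(trace(gf4_mul(d, m)))
        m = gf4_mul(m, beta_inv)
    return column

# Idempotent w*y + w^2*y^2 + w^2*y^3 + w*y^4 of the outer [5, 2, 4] code.
E = [0, 2, 3, 3, 2]

def outer_code():
    """All GF(4)-combinations of E(y) and y*E(y) modulo y^5 - 1."""
    shifted = E[-1:] + E[:-1]
    return {
        tuple(gf4_mul(c0, E[i]) ^ gf4_mul(c1, shifted[i]) for i in range(5))
        for c0, c1 in product(range(4), repeat=2)
    }

def star(outer):
    """Replace each outer symbol by the column psi(symbol); read the array by rows."""
    code = set()
    for word in outer:
        columns = [psi(d) for d in word]
        code.add(tuple(columns[j][i] for i in range(3) for j in range(5)))
    return code

B = outer_code()
C = star(B)
weights = sorted({sum(c) for c in C})
print(len(B), len(C), weights)   # 16 16 [0, 8]
```

Every nonzero word of the resulting binary code has weight 8, as expected for a [15, 4, 8] code: each nonzero outer codeword has four nonzero symbols and each of these becomes a column of weight 2.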


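Recall that the idempotent E(x) of $\mathcal{A}$ maps onto the element 1 ∈ GF($2^{k_1}$). If
$\mathcal{A}$ is a simplex code then a typical codeword $x^i E(x)$ is mapped onto $\beta^i$. But in
general we also need to know $(\xi)^{\psi}$, where $\xi$ is a primitive element of GF($2^{k_1}$).


Problem. (11) Show that the idempotent of $\mathcal{B}$ maps onto the idempotent of
$\mathcal{A} * \mathcal{B}$.


Theorem 5. If $(n_1, n_2) = 1$ then $\mathcal{A} * \mathcal{B}$ can be made into a cyclic code by the
transformation used in Theorem 1 (i.e. by setting z = xy).


Proof. Let (1) be a typical codeword of $\mathcal{A} * \mathcal{B}$, obtained from
$(\delta_0, \ldots, \delta_{n_2-1}) \in \mathcal{B}$. As in Theorem 1 it will follow that $\mathcal{A} * \mathcal{B}$ is a cyclic
code if the cyclic shifts

$$\begin{pmatrix}
a_{0,n_2-1} & a_{00} & \cdots & a_{0,n_2-2}\\
\vdots & \vdots & & \vdots\\
a_{n_1-1,n_2-1} & a_{n_1-1,0} & \cdots & a_{n_1-1,n_2-2}
\end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix}
a_{n_1-1,0} & \cdots & a_{n_1-1,n_2-1}\\
a_{00} & \cdots & a_{0,n_2-1}\\
\vdots & & \vdots\\
a_{n_1-2,0} & \cdots & a_{n_1-2,n_2-1}
\end{pmatrix}$$

are in $\mathcal{A} * \mathcal{B}$. In fact these are obtained from the codewords
$(\delta_{n_2-1}, \delta_0, \ldots, \delta_{n_2-2})$ and $(\beta\delta_0, \ldots, \beta\delta_{n_2-1})$ (by Lemma 10 of Ch. 8). Q.E.D.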


Example. We choose $\mathcal{B}$ to be a cyclic MDS code, for example the [9, 4, 6]
code over GF($2^3$) with zeros $\alpha^{-14}, \alpha^{-7}, 1, \alpha^{7}, \alpha^{14}$, where $\alpha$ is a primitive element
of GF($2^6$). This has weight distribution (see Ch. 11)

      i:     0     6     7     8      9
      A_i:   1    588   504   1827   1176

Take $\mathcal{A}$ to be the [7, 3, 4] simplex code. Then $\mathcal{A} * \mathcal{B}$ is a cyclic [63, 12, 24]
code with weight distribution

      i:     0    24    28    32     36
      A_i:   1    588   504   1827   1176


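Theorem 6. Assume $(n_1, n_2) = 1$. The $k_1 k_2$ nonzeros of $\mathcal{A} * \mathcal{B}$ are

$$z = \beta^{2^j}\gamma_i^{2^j}, \quad \text{for } j = 0, 1, \ldots, k_1 - 1,\ i = 1, \ldots, k_2,$$

where $\beta, \beta^2, \ldots, \beta^{2^{k_1-1}}$ and $\gamma_1, \ldots, \gamma_{k_2}$ are the nonzeros of $\mathcal{A}$ and $\mathcal{B}$ respectively.


Proof. A typical codeword of $\mathcal{A} * \mathcal{B}$ is

$$r_0(y) + x r_1(y) + \cdots + x^{n_1-1} r_{n_1-1}(y).$$

Now

$$r_0(y) + \beta r_1(y) + \cdots + \beta^{n_1-1} r_{n_1-1}(y) = \delta_0 + \delta_1 y + \cdots + \delta_{n_2-1} y^{n_2-1},$$

where $(\delta_0, \delta_1, \ldots, \delta_{n_2-1}) \in \mathcal{B}$. Also $(\delta_0^{2^j}, \ldots, \delta_{n_2-1}^{2^j})$ is a codeword of the $[n_2, k_2]$
cyclic code $\mathcal{B}'$ with nonzeros $\gamma_1^{2^j}, \ldots, \gamma_{k_2}^{2^j}$. Thus $\beta^{2^j}\gamma_i^{2^j}$ is a nonzero of $\mathcal{A} * \mathcal{B}$.
Q.E.D.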


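Example. (5) (cont.) Let $\alpha$ be a primitive element of GF($2^4$) and set $\omega = \alpha^5$.
Then the nonzeros of $\mathcal{A}$, $\mathcal{B}$ and $\mathcal{A} * \mathcal{B}$ are $\beta = \alpha^{10}$, $\beta^2 = \alpha^{20} = \alpha^5$; $\gamma_1 = \alpha^6$, $\gamma_2 = \alpha^9$;
and $\alpha^{10} \cdot \alpha^{6} = \alpha$, $\alpha^{20} \cdot \alpha^{12} = \alpha^2$, $\alpha^{10} \cdot \alpha^{9} = \alpha^4$, $\alpha^{20} \cdot \alpha^{18} = \alpha^8$. Here $\alpha, \alpha^2, \alpha^4, \alpha^8$ are
indeed the nonzeros of the [15, 4, 8] code.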


Problem. (12) Show that any irreducible $[n_1 n_2, k]$ binary code $\mathcal{C}$ with $(n_1, n_2) = 1$,
$n_1 > 1$, $n_2 > 1$, can be factored as $\mathcal{C} = \mathcal{A} * \mathcal{B}$. [Hint: Take $\mathcal{B}$ to be the
$[n_2, k/k_1]$ irreducible code over GF($2^{k_1}$) with nonzeros $\gamma^{2^{ik_1}}$, $i = 0, 1, \ldots, k/k_1 - 1$.]


*§6. A general decomposition for cyclic codes


This section gives a method of Kasami [731] which expresses the Mattson-
Solomon polynomials of any cyclic code of composite length in terms of MS
polynomials of shorter codes. (The methods of §§3-5 do not always apply to
reducible codes.)

Let $\mathcal{C}$ be an $[n = n_1 n_2, k, d]$ binary cyclic code, where $(n_1, n_2) = 1$, $n_1 > 1$,
$n_2 > 1$. (The same procedure works over any finite field.)


Notation. As before, $an_1 + bn_2 = 1$, $\alpha^n = 1$, $\beta = \alpha^{bn_2}$, $\gamma = \alpha^{an_1}$. Also I(i, j) is
defined by (3).


Case (I). $\mathcal{C}$ is irreducible, with say nonzeros $\alpha^i$, $i \in C_1$, where $C_1$ is the
cyclotomic coset $\{1, 2, 2^2, 2^3, \ldots, 2^{k-1}\}$ mod $n_1 n_2$. Write

$$C_1 = \{I(\mu_1, \nu_1), \ldots, I(\mu_k, \nu_k)\}.$$

Then $\mu_1, \ldots, \mu_k$ consists of several repetitions of the cyclotomic coset

$$C_1' = \{1, 2, 2^2, \ldots, 2^{r-1}\} \bmod n_1.$$

The set $\{\nu : I(1, \nu) \in C_1\}$ consists of $\{1, 2^r, 2^{2r}, \ldots, 2^{k-r}\}$ mod $n_2$.
For example, if $\mathcal{C}$ is a [63, 6, 32] code then n = 63, $n_1 = 7$, $n_2 = 9$, k = 6, and

$$C_1 = \{1, 2, 4, 8, 16, 32\} = \{I(1, 1), I(2, 2), I(4, 4), I(1, 8), I(2, 7), I(4, 5)\},$$
$$C_1' = \{1, 2, 4\}, \quad r = 3,$$
$$\{\nu : I(1, \nu) \in C_1\} = \{1, 8\}.$$
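The index bookkeeping in this example is easy to reproduce. The sketch below assumes that I(μ, ν), defined by (3) earlier in the chapter, is the unique residue mod $n_1 n_2$ that is congruent to μ mod $n_1$ and to ν mod $n_2$; this reading agrees with every pair listed in the example. The function names are illustrative only.

```python
def cyclotomic_coset(s, n):
    """The coset {s, 2s, 4s, ...} mod n, in the order generated."""
    coset, x = [], s % n
    while x not in coset:
        coset.append(x)
        x = (2 * x) % n
    return coset

def split_index(t, n1, n2):
    """Return (mu, nu) with t = I(mu, nu), assuming I is the CRT index."""
    return (t % n1, t % n2)

n1, n2 = 7, 9
C1 = cyclotomic_coset(1, n1 * n2)
pairs = [split_index(t, n1, n2) for t in C1]
C1_prime = sorted({mu for mu, _ in pairs})
K = sorted(nu for mu, nu in pairs if mu == 1)

print(C1)            # [1, 2, 4, 8, 16, 32]
print(pairs)         # [(1, 1), (2, 2), (4, 4), (1, 8), (2, 7), (4, 5)]
print(C1_prime, K)   # [1, 2, 4] [1, 8]
```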
Let $\mathcal{A}$ be the $[n_1, r]$ irreducible binary cyclic code with nonzeros
$\beta, \beta^2, \ldots, \beta^{2^{r-1}}$, and $\mathcal{B}$ the $[n_2, k/r]$ irreducible cyclic code over GF($2^r$)
with nonzeros $\gamma, \gamma^{2^r}, \ldots, \gamma^{2^{k-r}}$.
The MS polynomial of a codeword of $\mathcal{A}$ is

$$F(Y) = T_r(\xi Y^{-1}),$$

where $\xi$ is an arbitrary element of GF($2^r$). The MS polynomial of a codeword
of $\mathcal{B}$ is

$$\Phi(Z) = cZ^{-1} + (cZ^{-1})^{2^r} + \cdots + (cZ^{-1})^{2^{k-r}},$$

where c is an arbitrary element of GF($2^k$).

Theorem 7. (Kasami.) The MS polynomial of any codeword of $\mathcal{C}$ is obtained
from F(Y) by replacing $\xi$ by $\Phi(Z)$ and setting X = YZ and $Y^{n_1} = Z^{n_2} = 1$.


Proof. Let F(Y, Z) be the polynomial obtained in this way. Then, since
$2^r \equiv 1 \pmod{n_1}$,

$$F(Y, Z) = T_r(\Phi(Z)Y^{-1}) = T_r\bigl(cX^{-1} + (cX^{-1})^{2^r} + \cdots + (cX^{-1})^{2^{k-r}}\bigr) = T_k(cX^{-1})$$

after some algebra. But this is the MS polynomial of a codeword of $\mathcal{C}$.
Q.E.D.


For the preceding example,

$$F(Y, Z) = T_3(cY^{-1}Z^{-1} + c^8 Y^{-1}Z^{-8}) = T_3(cX^{-1} + c^8 X^{-8}) = T_6(cX^{-1}), \quad c \in \mathrm{GF}(2^6).$$


Case (II). In general, let $\mathcal{C}$ be any [n, k] cyclic code, with nonzeros $\alpha^j$,
$j \in \{C_{a_i} : i = 1, \ldots, s\}$. Suppose $|C_{a_i}| = k_i$. For each $i = 1, \ldots, s$, let

$$C_i' = \{\mu : I(\mu, \nu) \in C_{a_i} \text{ for some } \nu\},$$

let $b_i$ be the smallest number in $C_i'$, $r_i = |C_i'|$ and

$$K_i = \{\nu : I(b_i, \nu) \in C_{a_i}\}.$$

(It may happen that $C_i' = C_j'$ while $K_i \ne K_j$ or vice-versa.) Let $\mathcal{A}$ be the binary
cyclic code of length $n_1$ with nonzeros $\beta^j$, $j \in \{C_i' : i = 1, \ldots, s\}$, and let $\mathcal{B}_i$ be
the irreducible cyclic code over GF($2^{r_i}$) of length $n_2$ with nonzeros $\gamma^{\mu}$, $\mu \in K_i$.
The MS polynomial of a codeword of $\mathcal{A}$ is

$$F(Y) = \sum_{i=1}^{s} T_{r_i}(\xi_i Y^{-b_i}),$$

for $\xi_i \in$ GF($2^{r_i}$). Note that, if $C_i' = C_j'$, this cyclotomic coset appears twice in
the sum, once with $\xi_i$ and once with $\xi_j$. The MS polynomial of a codeword of
$\mathcal{B}_i$ is

$$\Phi_i(Z) = c_i Z^{-\mu} + (c_i Z^{-\mu})^{2^{r_i}} + (c_i Z^{-\mu})^{2^{2r_i}} + \cdots,$$

where $\mu \in K_i$, $c_i \in$ GF($2^{k_i}$).


Theorem 8. (Kasami.) The MS polynomial of any codeword of $\mathcal{C}$ is obtained
from F(Y) by replacing $\xi_i$ by $\Phi_i(Z)$, and setting X = YZ, $Y^{n_1} = Z^{n_2} = 1$.


For example let $\mathcal{C}$ be the [63, 18] code with nonzeros $\alpha^j$, $j \in C_1 \cup C_5 \cup C_{15}$.
Then n = 63, $n_1 = 7$, $n_2 = 9$,

$$C_1 = \{I(1, 1), I(2, 2), I(4, 4), I(1, 8), I(2, 7), I(4, 5)\},$$
$$C_5 = \{I(5, 5), I(3, 1), I(6, 2), I(5, 4), I(3, 8), I(6, 7)\},$$
$$C_{15} = \{I(1, 6), I(2, 3), I(4, 6), I(1, 3), I(2, 6), I(4, 3)\},$$
$$C_1' = C_3' = \{1, 2, 4\}, \quad b_1 = b_3 = 1, \quad r_1 = r_3 = 3,$$
$$C_2' = \{5, 3, 6\}, \quad b_2 = 3, \quad r_2 = 3,$$
$$K_1 = K_2 = \{1, 8\}, \quad K_3 = \{6, 3\},$$
$$\beta = \alpha^{36}, \quad \gamma = \alpha^{28}.$$

$\mathcal{A}$ has nonzeros $\beta^j$, $j = 1, \ldots, 6$ and is the [7, 6, 2] even weight code. $\mathcal{B}_1 = \mathcal{B}_2$
has nonzeros $\gamma, \gamma^8$, check polynomial $(x + \gamma)(x + \gamma^8) = x^2 + \alpha^{54}x + 1$, and $\mathcal{B}_3$
has nonzeros $\gamma^3, \gamma^6$ and check polynomial $(x + \gamma^3)(x + \gamma^6) = x^2 + x + 1$.
The MS polynomials of $\mathcal{A}$, $\mathcal{B}_1$, $\mathcal{B}_2$, $\mathcal{B}_3$ are respectively

$$F(Y) = T_3(\xi_1 Y^{-1}) + T_3(\xi_2 Y^{-3}) + T_3(\xi_3 Y^{-1}),$$
$$\Phi_1(Z) = c_1 Z^{-1} + c_1^8 Z^{-8},$$
$$\Phi_2(Z) = c_2 Z^{-1} + c_2^8 Z^{-8},$$
$$\Phi_3(Z) = c_3 Z^{-6} + c_3^8 Z^{-3}.$$

After replacing $\xi_i$ by $\Phi_i(Z)$ we obtain

$$T_6(c_1 X^{-1}) + T_6(c_2 X^{-10}) + T_6(c_3 X^{-15}),$$

which is the MS polynomial of a codeword of $\mathcal{C}$.


PART II: Other methods of combining codes


We have already seen a few ways of putting codes together to get new
ones:


(i) The direct sum of two codes (Problem 17 of Ch. 1, §9 of Ch. 2);

(ii) The technique of pasting two codes side by side (Problem 17 of Ch. 1,
Fig. 2.6);

(iii) The |u|u + v| construction (§9 of Ch. 2);

(iv) Concatenating two codes (§11 of Ch. 10);


and of course the various product constructions described in Part I of this 
chapter. It is very easy to invent other constructions, and indeed the literature 
contains a large number (see the papers mentioned in the Notes). What makes 
the constructions given below special is that they have all been used to find 
exceptionally good codes. They are record holders. We usually speak of one 
code as being better than another if it contains more codewords (and has the 
same length and minimum distance), or alternatively if it has a greater length 
(and the same redundancy and minimum distance). This neglects all problems 
of encoding and decoding, but we justify this by arguing that once good 
codes, by this definition, have been found, decoding algorithms will follow. 
Appendix A contains a table giving the best codes presently known by length
up to 512; many of these were found by the methods described here. 

We begin (in §7) by describing techniques for increasing the length of a
code in an efficient way. The tail constructions X and X4 are given in §7.1
and §7.2. Section 7.3 is a digression, giving a brief summary of what is known
about the largest single- and double-error-correcting codes; even these
comparatively simple problems are unsolved. §7.4 describes the
|a + x|b + x|a + b + x| construction, important because it is one of the simplest
ways to get the Golay code. Then a construction of Piret for doubling the
length of an irreducible code is given in §7.5.

§8 gives some constructions related to concatenated codes. In particular,
§8.2 presents a powerful technique of Zinov'ev for generalizing concatenated
codes.

Finally, §9 gives some clever methods for shortening a code.


§7. Methods which increase the length


§7.1. Construction X: adding tails to the codewords. This combines three
codes to produce a fourth. Suppose $\mathcal{C}_1$ and $\mathcal{C}_2$ are $(n_1, M_1, d_1)$ and
$(n_1, M_2 = bM_1, d_2)$ codes, with the property that $\mathcal{C}_2$ is the union of b disjoint
translates of $\mathcal{C}_1$:

$$\mathcal{C}_2 = (x_1 + \mathcal{C}_1) \cup (x_2 + \mathcal{C}_1) \cup \cdots \cup (x_b + \mathcal{C}_1), \qquad (14)$$

for suitable vectors $x_1, x_2, \ldots, x_b$.


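Example. (6) Let $\mathcal{C}_1$ be the (4, 2, 4) repetition code {0000, 1111}, and $\mathcal{C}_2$ the
(4, 8, 2) even weight code {0000, 1111, 0011, 1100, 0101, 1010, 1001, 0110}.
Then b = 4 and

$$\mathcal{C}_2 = \mathcal{C}_1 \cup (0011 + \mathcal{C}_1) \cup (0101 + \mathcal{C}_1) \cup (1001 + \mathcal{C}_1).$$

Let $\mathcal{C}_3 = \{y_1, \ldots, y_b\}$ be any $(n_3, b, d_3)$ code. In the example we could take
$\mathcal{C}_3$ to be the (3, 4, 2) code {000, 011, 101, 110}. Then the new code $\mathcal{C}$ is defined
to be

$$|x_1 + \mathcal{C}_1 \,|\, y_1| \cup |x_2 + \mathcal{C}_1 \,|\, y_2| \cup \cdots \cup |x_b + \mathcal{C}_1 \,|\, y_b|. \qquad (15)$$

In other words $\mathcal{C}$ consists of the vectors $|x_i + u \,|\, y_i|$ for $u \in \mathcal{C}_1$, $i = 1, \ldots, b$.
Simply stated, $\mathcal{C}_2$ is divided into cosets of $\mathcal{C}_1$ and a different tail ($y_i$) is
attached to each coset; see Fig. 18.4.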


Fig. 18.4. Construction X: adding tails to $\mathcal{C}_2$. (The figure shows the cosets of $\mathcal{C}_1$ with a tail attached to each.)


Theorem 9. The new code $\mathcal{C}$ has parameters

$$(n_1 + n_3,\ M_2,\ d \ge \min\{d_1, d_2 + d_3\}).$$


Proof. Let X = |x|y| and X' = |x'|y'| be distinct codewords of $\mathcal{C}$. If x and x'
belong to the same coset of $\mathcal{C}_1$ in $\mathcal{C}_2$, dist (X, X') = dist (x, x') $\ge d_1$. If x and x'
belong to different cosets, then dist (x, x') $\ge d_2$ and dist (y, y') $\ge d_3$. Q.E.D.

Example. (6) (cont.) Attaching the tails $\mathcal{C}_3$ we find that $\mathcal{C}$ is the (7, 8, 4) linear
code {0000000, 1111000, 0011011, 1100011, 0101101, 1010101, 1001110,
0110110}.
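Example 6 is small enough to build directly. The following sketch follows (15) with the coset representatives and tails used in the example; the helper names are illustrative and not part of the text.

```python
from itertools import combinations

def xor(u, v):
    """Componentwise sum of two binary tuples."""
    return tuple(a ^ b for a, b in zip(u, v))

def bits(s):
    """Turn a string such as '0101' into a tuple of ints."""
    return tuple(int(ch) for ch in s)

def construction_x(c1, coset_reps, tails):
    """Attach tails[i] to every word of the coset coset_reps[i] + c1, as in (15)."""
    return {xor(x, u) + y for x, y in zip(coset_reps, tails) for u in c1}

def min_distance(code):
    """Minimum Hamming distance between distinct codewords."""
    return min(sum(xor(u, v)) for u, v in combinations(code, 2))

c1 = [bits("0000"), bits("1111")]                            # (4, 2, 4) code
reps = [bits(s) for s in ("0000", "0011", "0101", "1001")]    # x_1, ..., x_4
tails = [bits(s) for s in ("000", "011", "101", "110")]       # (3, 4, 2) code
code = construction_x(c1, reps, tails)

print(len(code), min_distance(code))   # 8 4
```

The eight words produced are exactly those listed in the example, and the minimum distance 4 equals min{$d_1$, $d_2 + d_3$} = min{4, 2 + 2}.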


As this example shows, if $\mathcal{C}_1$, $\mathcal{C}_2$ and $\mathcal{C}_3$ are linear then we can arrange the
$x_i$ and $y_i$ so that $\mathcal{C}$ is linear.

Construction X can be applied whenever nested codes are found. For
example, the primitive BCH code of designed distance $d_2$ is a union of cosets
of the BCH code of designed distance $d_1 > d_2$ (§6 of Ch. 7). So the
construction applies to any pair of primitive BCH codes. E.g. if $d_1 = d_2 + 2$ and
$b = 2^k$, then $\mathcal{C}_3$ can be taken to be the $(k + 1, 2^k, 2)$ code. In this case we are
combining $[n_1, k_1, d_1 = d_2 + 2]$ and $[n_1, k + k_1, d_2]$ codes to obtain an
$[n_1 + k + 1, k + k_1, d_2 + 2]$ code. (This generalizes the Andryanov-Saskovets
construction [113, p. 333].) Some examples are

      C_1             C_2             C
      [31, 6, 15]     [31, 11, 11]    [37, 11, 13]
      [63, 36, 11]    [63, 39, 9]     [67, 39, 11]

Codes formed by applying construction X to BCH codes are indicated by XB
in Fig. 2 of Appendix A.
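The parameter arithmetic behind the two BCH examples can be written out as a small helper. This is only a sketch of Theorem 9 specialised to linear codes with an even weight tail; the function names are illustrative, and the distance returned is the lower bound guaranteed by the theorem.

```python
def construction_x_params(code1, code2, tail):
    """Parameters [n1 + n3, k2, min(d1, d2 + d3)] given by Theorem 9 for linear codes."""
    (n1, k1, d1), (n2, k2, d2), (n3, k3, d3) = code1, code2, tail
    if n1 != n2 or k3 != k2 - k1:
        raise ValueError("need equal lengths and a tail code of dimension k2 - k1")
    return (n1 + n3, k2, min(d1, d2 + d3))

def even_weight_tail(k):
    """The [k + 1, k, 2] even weight code used as the tail."""
    return (k + 1, k, 2)

print(construction_x_params((31, 6, 15), (31, 11, 11), even_weight_tail(5)))   # (37, 11, 13)
print(construction_x_params((63, 36, 11), (63, 39, 9), even_weight_tail(3)))   # (67, 39, 11)
```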

Other nested codes to which the construction may be applied are cyclic
codes (indicated by XC); Preparata codes, §6 of Ch. 15 (XP); and the
Delsarte-Goethals codes, Fig. 15.10 (DG). E.g. in view of Theorem 36 of Ch.
15, we can use $\mathcal{C}_1 = \mathcal{P}(m)^*$, $\mathcal{C}_2$ = a Hamming code, and obtain an infinite
family of nonlinear

$$(2^m + m - 1,\ 2^{2^m - m - 1},\ 5), \quad m = 4, 6, 8, \ldots \qquad (16)$$

codes. If m = 4 this is a $(19, 2^{11}, 5)$ code.


Problems. (13) (a) Apply Construction X to the Y(m, d) codes ($5 of Ch. 15) 
to obtain (74, 27,28) and (74, 2", 32) codes. 

(b) Apply Construction X with €,- (1,6), €,-— X(6) to obtain a 
(70, 2", 30) code. 

(14) (a) Construction X3. Let the $(n_1, M_3 = cM_2 = bcM_1, d_3)$ code $\mathcal{C}_3$ be a
union of c disjoint translates of the $(n_1, M_2 = bM_1, d_2)$ code $\mathcal{C}_2$, say

$$\mathcal{C}_3 = \bigcup_{i=1}^{c} (u_i + \mathcal{C}_2),$$

where $\mathcal{C}_2$ is the union of b disjoint translates of the $(n_1, M_1, d_1)$ code $\mathcal{C}_1$, say

$$\mathcal{C}_2 = \bigcup_{j=1}^{b} (v_j + \mathcal{C}_1).$$

Each codeword of $\mathcal{C}_3$ can be written uniquely as $u_i + v_j + c$, $c \in \mathcal{C}_1$. Also let
$\mathcal{C}_4 = \{y_1, \ldots, y_b\}$ and $\mathcal{C}_5 = \{z_1, \ldots, z_c\}$ be $(n_4, b, \Delta)$ and $(n_5, c, \delta)$ codes. Show
that the code consisting of all vectors $|u_i + v_j + c \,|\, y_j \,|\, z_i|$ has parameters

$$(n_1 + n_4 + n_5,\ M_3,\ d \ge \min\{d_1, d_2 + \Delta, d_3 + \delta\}).$$

(b) For example, take $\mathcal{C}_1$ = [255, 179, 21], $\mathcal{C}_2$ = [255, 187, 19], $\mathcal{C}_3$ = [255, 191,
17] BCH codes, and $\mathcal{C}_4$ = [9, 8, 2], $\mathcal{C}_5$ = [8, 4, 4] codes to obtain a [272, 191, 21]
code.

(c) The following double tail construction has been successfully used by
Kasahara et al. [721]. Let $\alpha$ be a primitive element of GF($2^m$), and define three
BCH codes:

      Code    Zeros                                    Parameters
      B_1     1, α, α^2, ..., α^{2i}                   [2^m − 1, k, 2i + 2]
      B_2     α^{−2}, α^{−1}, 1, α, ..., α^{2i}        [2^m − 1, k − m, 2i + 4]
      B_3     1, α, α^2, ..., α^{2i+2}                 [2^m − 1, k − m', 2i + 4]

where m' is the degree of the minimal polynomial of $\alpha^{2i+1}$. Now $\mathcal{B}_1$ is the
union of $2^m$ cosets of $\mathcal{B}_2$. So we can append a tail r(x) of length m to
$u(x) \in \mathcal{B}_1$ such that vectors in the same coset have the same tail, but vectors
in different cosets have different tails. Again $\mathcal{B}_1$ is the union of $2^{m'}$ cosets of
$\mathcal{B}_3$, and we add a tail s(x) of length m' in a similar way. Then the new code $\mathcal{C}$
consists of the vectors


| u(x)| r(x)|s(x)|s()|. 
Show that $\mathcal{C}$ has parameters

$$[2^m + m + m',\ k,\ 2i + 5]. \qquad (17)$$

(d) Use this construction to obtain [22, 6, 9] (m = 4) and [76, 50, 9] (m = 6)
codes.


For other examples and generalizations see [721]. Codes obtained in this 
way are denoted by XQ in Fig. 2 of Appendix A. 


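§7.2. Construction X4: combining four codes. Suppose we are given four
codes: an $(n_1, M_1, d_1)$ code $\mathcal{C}_1$, an $(n_1, M_2 = bM_1, d_2)$ code $\mathcal{C}_2$, an
$(n_3, M_3, d_3 \ge d_1)$ code $\mathcal{C}_3$, and an $(n_3, M_4 = bM_3, d_4)$ code $\mathcal{C}_4$, with the properties
that (i) $\mathcal{C}_2$ is a union of b disjoint cosets of $\mathcal{C}_1$:

$$\mathcal{C}_2 = (x_1 + \mathcal{C}_1) \cup (x_2 + \mathcal{C}_1) \cup \cdots \cup (x_b + \mathcal{C}_1),$$

and (ii) $\mathcal{C}_4$ is a union of b disjoint cosets of $\mathcal{C}_3$:

$$\mathcal{C}_4 = (y_1 + \mathcal{C}_3) \cup (y_2 + \mathcal{C}_3) \cup \cdots \cup (y_b + \mathcal{C}_3),$$

for suitable vectors $x_1, \ldots, x_b$ and $y_1, \ldots, y_b$. Then the new code $\mathcal{C}$ consists of
all the vectors

$$|x_i + u \,|\, y_i + v|, \quad i = 1, \ldots, b,\ u \in \mathcal{C}_1,\ v \in \mathcal{C}_3. \qquad (18)$$

Simply stated, the vectors of the i-th coset of $\mathcal{C}_1$ in $\mathcal{C}_2$ are appended to the
vectors of the i-th coset of $\mathcal{C}_3$ in $\mathcal{C}_4$ in all possible ways; see Fig. 18.5.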


Fig. 18.5. Construction X4, combining four codes.


Theorem 10. The new code $\mathcal{C}$ has parameters

$$(n_1 + n_3,\ M_2 M_3,\ d \ge \min\{d_1, d_2 + d_4\}). \qquad (19)$$

The proof is immediate. If $\mathcal{C}_1$-$\mathcal{C}_4$ are linear then $\mathcal{C}$ can be made linear.


Example. (7) Take $\mathcal{C}_1$ = the (15, 256, 5) Nordstrom-Robinson code (Lemma 22
of Ch. 15), $\mathcal{C}_2$ = the (15, 1280, 3) code consisting of $\mathcal{C}_1$ together with any four
of the eight cosets of $\mathcal{C}_1$ in the [15, 11, 3] Hamming code, $\mathcal{C}_3$ = the (5, 2, 5)
code {00000, 11111}, and $\mathcal{C}_4$ = the (5, 10, 2) code {00000, 11111, 11000, 00111,
10100, 01011, 10010, 01101, 10001, 01110}. Then $\mathcal{C}$ is a (20, 2560, 5) code.
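Example 7 needs the Nordstrom-Robinson code, so the sketch below illustrates (18) on a much smaller instance chosen here for illustration only (it is not an example from the text): both nested pairs are the (4, 2, 4) repetition code inside the (4, 8, 2) even weight code, so b = 4 and Theorem 10 predicts an (8, 16, 4) code.

```python
from itertools import combinations

def xor(u, v):
    """Componentwise sum of two binary tuples."""
    return tuple(a ^ b for a, b in zip(u, v))

def bits(s):
    """Turn a string such as '0101' into a tuple of ints."""
    return tuple(int(ch) for ch in s)

def construction_x4(c1, xs, c3, ys):
    """All vectors |x_i + u | y_i + v| with u in c1 and v in c3, as in (18)."""
    return {xor(x, u) + xor(y, v)
            for x, y in zip(xs, ys) for u in c1 for v in c3}

def min_distance(code):
    """Minimum Hamming distance between distinct codewords."""
    return min(sum(xor(u, v)) for u, v in combinations(code, 2))

rep = [bits("0000"), bits("1111")]                            # (4, 2, 4) code
reps = [bits(s) for s in ("0000", "0011", "0101", "1001")]    # coset representatives
code = construction_x4(rep, reps, rep, reps)

print(len(code), min_distance(code))   # 16 4
```

Here $M_2 M_3 = 8 \cdot 2 = 16$ and min{$d_1$, $d_2 + d_4$} = min{4, 2 + 2} = 4, in agreement with (19).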


Further examples are given in $7.3 and in Fig. 2 of Appendix A (indicated 
by X4). 

Sugiyama et al. [1291] have successfully applied Constructions X and X4 
and modifications of them to Goppa codes. A large number of good codes 
found in this way are denoted by GP in Fig. 2 of Appendix A. 


Problem. (15) Apply Construction X4 to extend e-error-correcting BCH codes
of length $n_1 = 2^m - 1$. Let $m = ae - \beta$, where $0 \le \beta < e$, and take $\mathcal{C}_1$, $\mathcal{C}_2$, $\mathcal{C}_3$
equal to $[n_1, n_1 - em, 2e + 1]$, $[n_1, n_1 - (e - 1)m, 2e - 1]$ and $[2^a, 2^a - ea - 1, 2e + 2]$
BCH codes. Take $\mathcal{C}_4$ equal to $\mathcal{C}_3$ plus $2^m - 1$ other cosets of $\mathcal{C}_3$ in the
$[2^a, 2^a - 1, 2]$ even weight code. Show that $\mathcal{C}$ is a
$[2^m + 2^a - 1,\ 2^m + 2^a - em - \beta - 2,\ 2e + 1]$ code. In the most favorable case, when m
is divisible by e and $\beta = 0$, the number of information symbols in $\mathcal{C}_1$ has been
increased by $2^{m/e} - 1 \approx n_1^{1/e}$ at the cost of adding one check symbol.


§7.3. Single- and Double-Error-Correcting Codes


Single-error-correcting codes. The best binary linear single-error-correcting
codes are known. They are Hamming and shortened Hamming codes, with
parameters [n, n − m, 3], where $2^{m-1} \le n \le 2^m - 1$.


Problem. (16) Prove this statement!


For at least half the values of n there exist nonlinear codes with more
codewords; see Theorem 34 of Ch. 2 and Fig. 2 of Appendix A.


Research Problems (18.1). Determine A(n, 3) for $n \ge 9$. (For $n \le 8$ see Fig. 1 of
Appendix A.)


Double-error-correcting (d.e.c.) codes.

Much less is known.

(18.2) What is the longest binary linear double-error-correcting code of
redundancy r, for r = 4, 5, ...? If this length is denoted by $n_L(r)$, the following
values are known (from Helgert and Stinaff [636]):

      r:        4   5   6   7    8    9       10
      n_L(r):   5   6   8   11   17   23-29   32-42        (20)

Now let us study the best linear or nonlinear d.e.c. codes. We have already
seen the following codes.

(i) Primitive d.e.c. BCH codes with parameters $[2^m - 1, 2^m - 1 - 2m, 5]$.
However, these are always inferior to:

(ii) Nonprimitive d.e.c. BCH codes with parameters $[2^m, 2^m - 2m, 5]$,
obtained by shortening the code of length $2^m + 1$ and zeros $\alpha^i$, $i \in C_0 \cup C_1$.

(iii) The Preparata codes $\mathcal{P}(m)^*$ with parameters $(2^m - 1, 2^{2^m - 2m}, 5)$ for even
$m \ge 4$.

(iv) The extended Preparata codes (16).

(v) The (11, 24, 5) Hadamard code of §3 of Ch. 2, and the (20, 2560, 5) code
of Example 7 above.

The following d.e.c. codes are also known:

(vi) $[2^m + 2^{[(m+1)/2]} - 1,\ 2^m + 2^{[(m+1)/2]} - 2m - 2,\ 5]$ shortened BCH codes, given
in Problem 28 below.

(vii) A [74, 61, 5] shortened alternant code (Helgert [633]) and a [278, 261, 5]
shortened generalized BCH code (Chien and Choy [286]).

(viii) Wagner's [23, 14, 5] quasiperfect code ([1378]).

Research Problem (18.3). Give a simple construction of the latter code.


Let n*(r) be the length of the longest binary (linear or nonlinear) d.e.c.
code of redundancy r. Of course $n^*(r) \ge n_L(r)$. From (ii) and (iii) we have

$$n^*(4a + 2) \ge 2^{2a+1} \quad\text{and}\quad n^*(4a + 3) \ge 2^{2a+2} - 1. \qquad (21)$$


Problem. (17) Show $n^*(4a + 3) = 2^{2a+2} - 1$. [Hint: Corollary 14 of Ch. 17.]


Construction X4 can be used to get further lower bounds on n*(r). Take
$\mathcal{C}_1 = \mathcal{P}(m)^*$, $\mathcal{C}_2$ = Hamming code $\mathcal{H}_m$, $\mathcal{C}_3$ = the longest distance 6 code of
redundancy m, of length n*(m − 1) + 1, and $\mathcal{C}_4$ = an even weight code of the
same length. The new code $\mathcal{C}$ has length $2^m + n^*(m - 1)$, redundancy 2m, and
so

$$n^*(2m) \ge n^*(m - 1) + 2^m, \quad \text{for even } m \ge 4. \qquad (22)$$

Thus the Preparata code $\mathcal{P}(m)^*$ has been extended by the addition of
approximately $\sqrt{n + 1}$ information symbols at the cost of adding one check symbol.


Problems. (18) Show that, for even $m \ge 4$, $n^*(2m + 1) \ge n^*(m) + 2^m$.
(19) Show $n^*(12) \ge 70$, $n^*(16) \ge 271$, $n^*(20) \ge 1047$ and $n^*(21) \ge 1056$.


The best lower bounds on n*(r) known to us are shown in Fig. 2 of
Appendix A (and Table II of [1237]). The only exact values known are
n*(6) = 8, n*(7) = 15, and n*(4a + 3) from Problem 17. Also n*(8) = 19 or 20.


Research Problem (18.4). Determine n*(r).


Even less is known about triple-error-correcting codes! 


§7.4. The |a + x|b + x|a + b + x| Construction. Let $\mathcal{C}_1$ and $\mathcal{C}_2$ be $(n, M_1, d_1)$
and $(n, M_2, d_2)$ binary codes. The new code $\mathcal{C}$ consists of all vectors

$$|a + x \,|\, b + x \,|\, a + b + x|, \quad a, b \in \mathcal{C}_1,\ x \in \mathcal{C}_2. \qquad (23)$$

Clearly $\mathcal{C}$ is a code of length 3n containing $M_1^2 M_2$ codewords, and is linear if
$\mathcal{C}_1$ and $\mathcal{C}_2$ are. No simple formula is known for the minimum distance of $\mathcal{C}$,
although a lower bound is given by:


Theorem 11. For any binary vectors a, b, x,

$$\mathrm{wt}\,|a + x \,|\, b + x \,|\, a + b + x| = 2\,\mathrm{wt}(a\ \mathrm{OR}\ b) - \mathrm{wt}(x) + 4s \ \ge\ 2\,\mathrm{wt}(a\ \mathrm{OR}\ b) - \mathrm{wt}(x), \qquad (24)$$

where s is the number of times a is 0, b is 0 and x is 1.

Problem. (20) Prove Theorem 11.
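Before attempting the proof it can be reassuring to confirm the identity in (24) by exhaustion for a small length. The following sketch (a numerical check, not a proof; the function names are illustrative) compares both sides for every triple of binary vectors of length 3.

```python
from itertools import product

def weight(v):
    """Hamming weight of a binary sequence."""
    return sum(v)

def check_theorem_11(n):
    """Compare both sides of (24) for every triple of binary vectors of length n."""
    vectors = list(product((0, 1), repeat=n))
    for a, b, x in product(vectors, repeat=3):
        lhs = (weight([p ^ r for p, r in zip(a, x)])
               + weight([q ^ r for q, r in zip(b, x)])
               + weight([p ^ q ^ r for p, q, r in zip(a, b, x)]))
        a_or_b = weight([p | q for p, q in zip(a, b)])
        # s counts the coordinates where a is 0, b is 0 and x is 1
        s = sum(1 for p, q, r in zip(a, b, x) if (p, q, r) == (0, 0, 1))
        if lhs != 2 * a_or_b - weight(x) + 4 * s:
            return False
    return True

print(check_theorem_11(3))   # True
```

Since both sides of (24) are sums over coordinates, the check for length 1 already covers the six essentially different cases; the proof asked for in Problem 20 is this case analysis.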


Theorem 12. (Turyn.) Let $\mathcal{B}_1$ be the [7, 4, 3] Hamming code with nonzeros
$\alpha^i$, $i \in C_1$, and let $\mathcal{B}_2$ be formed by reversing the codewords of $\mathcal{B}_1$. Let $\mathcal{C}_1$ and
$\mathcal{C}_2$ be [8, 4, 4] codes obtained by adding overall parity checks to $\mathcal{B}_1$ and $\mathcal{B}_2$.
Then the code $\mathcal{C}$ given by (23) is the [24, 12, 8] Golay code $\mathcal{G}_{24}$.


Proof. Only the minimum distance needs to be checked. That this is the Golay
code then follows from Theorem 14 of Ch. 20. Let u = |a + x|b + x|a + b + x|
be a nonzero codeword of $\mathcal{C}$. Each of a, b, x has weight 0, 4 or 8. (i) If at most
one of a, b, x has weight 4, wt (u) $\ge$ 8. (ii) If two of a, b, x have weight 4, then
wt (u) $\ge$ 8. (For if wt (a) = 4, wt (x) = 4 then wt (a + x) $\ge$ 2, etc.) (iii) Suppose
wt (a) = wt (b) = wt (x) = 4. If a $\ne$ b, wt (a OR b) $\ge$ 6 and Theorem 11 implies
wt (u) $\ge$ 8. If a = b, then u = |a + x|a + x|x| and wt (u) = 2 wt (a + x) + 4 $\ge$ 8.
Hence in all cases wt (u) $\ge$ 8. Q.E.D.
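Turyn's construction is small enough to enumerate. The sketch below is an illustration under stated assumptions: it takes the cyclic [7, 4, 3] Hamming code generated by $1 + x + x^3$ (the text does not fix a generator polynomial, so this choice is an assumption), reverses it, extends both codes by a parity check, and forms all $16^3$ vectors (23).

```python
from itertools import product

def span(basis):
    """All GF(2)-linear combinations of the given basis vectors."""
    words = set()
    for coeffs in product((0, 1), repeat=len(basis)):
        w = [0] * len(basis[0])
        for c, b in zip(coeffs, basis):
            if c:
                w = [p ^ q for p, q in zip(w, b)]
        words.add(tuple(w))
    return words

def extend(code):
    """Append an overall parity check to every codeword."""
    return {w + (sum(w) % 2,) for w in code}

g = [1, 1, 0, 1, 0, 0, 0]                              # 1 + x + x^3 (assumed choice)
b1 = span([g[7 - s:] + g[:7 - s] for s in range(4)])   # g, xg, x^2 g, x^3 g
b2 = {w[::-1] for w in b1}                             # the reversed code
c1, c2 = extend(b1), extend(b2)

golay = set()
for a, b, x in product(c1, c1, c2):
    golay.add(tuple(p ^ r for p, r in zip(a, x))
              + tuple(q ^ r for q, r in zip(b, x))
              + tuple(p ^ q ^ r for p, q, r in zip(a, b, x)))

weights = sorted({sum(w) for w in golay})
print(len(golay), weights)   # 4096 [0, 8, 12, 16, 24]
```

The enumeration gives $2^{12}$ distinct codewords of length 24 with minimum nonzero weight 8, consistent with the theorem.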


Problem. (21) Generalizing Theorem 12, let $\mathcal{C}_1$ and $\mathcal{C}_2$ be $[2^m, m + 1, 2^{m-1}]$
first-order Reed-Muller codes one of which is the reverse of the other (except
for the overall parity check). Show that $\mathcal{C}$ has parameters

$$[3 \cdot 2^m,\ 3m + 3,\ 2^m], \quad m \ge 3.$$

[Hint: see [1237].] When m = 4 this is a [48, 15, 16] code.


Research Problem (18.5). Find other applications of (23). Are there other
constructions like (23) which give good codes?


§7.5. Piret's construction. This is a technique for doubling the length of an
irreducible code. Let $\mathcal{C}$ be an [n, k, d] irreducible cyclic binary code, with
idempotent $\theta_1(x)$ (see Ch. 8). Here k is the smallest integer such that n divides
$2^k - 1$. Put $N = (2^k - 1)/n$, let $\xi$ be a primitive element of GF($2^k$), and let
$\alpha = \xi^N$. Thus $\alpha^n = 1$, and the nonzeros of $\mathcal{C}$ may be taken to be
$\alpha, \alpha^2, \alpha^{2^2}, \ldots, \alpha^{2^{k-1}}$. Also $\mathcal{C}$ consists of the codewords $i(x)\theta_1(x)$ (mod $x^n - 1$),
where deg i(x) < k.

We recall from Ch. 8 that $\mathcal{C}$ is isomorphic to GF($2^k$), with

      Element of C            Element of GF(2^k)
      θ_1(x)            <->   1
      x θ_1(x)          <->   α
      γ(x) θ_1(x)       <->   ξ   (this defines γ(x))

Then any nonzero codeword of $\mathcal{C}$ can be written as

$$\gamma(x)^j \theta_1(x) \leftrightarrow \xi^j,$$

$0 \le j \le 2^k - 2$. Let $w_j = \mathrm{wt}(\gamma(x)^j \theta_1(x))$. Then $w_{j+N} = w_j$. (For
$\gamma(x)^{j+N}\theta_1(x) \leftrightarrow \xi^{j+N} = \alpha\xi^j \leftrightarrow x\gamma(x)^j\theta_1(x)$.) We now choose the integer a so as
to maximize

$$d' = \min_{0 \le j < N} (w_j + w_{j+a}). \qquad (24a)$$

The new code $\mathcal{D}$ consists of 0 and the vectors

$$u_j = |\gamma(x)^j \theta_1(x) \,|\, \gamma(x)^{j+a}\theta_1(x)|, \qquad (25)$$

$0 \le j \le 2^k - 2$. Then $\mathcal{D}$ is a [2n, k, d'] code where d' is given by (24a).

Often it is possible to enlarge $\mathcal{D}$ by adding one or both of the generators
11···100···0 and 00···011···1, and adding one or two overall parity
checks.

For example, let $\mathcal{C}$ be the [9, 6, 2] code with idempotent $\theta_1(x) = x^3 + x^6$.
Here N — 7, £ is a primitive element of GF(2°) (see Fig. 4.5), a = £', y(x) = 
X^ t x* t xt +x", and (wo, Wi,..., wo) = (2, 6, 6, 4, 6, 4, 4). The best choice for a 
is 1, and @ is an [18,6,6] code which can be enlarged to a [20,8,6] code. 

Codes obtained in this way are indicated by the symbol PT in Fig. 2 of 
Appendix A. 


Problems. (22) Let $\mathcal{C}$ be a [17, 8, 6] code, with N = 15. Obtain [34, 8, 14] and
[35, 9, 14] codes.
(23) Let $\mathcal{C}$ be a [21, 6, 8] code, with N = 3. Obtain a [44, 8, 18] code.


Research Problem (18.6). Let $\mathcal{C}$ be a cyclic code of length n. How should a(x)
be chosen so that the minimum distance of the code
$\{|u(x) \,|\, a(x)u(x) \pmod{x^n - 1}| : u(x) \in \mathcal{C}\}$ is as large as possible?


§8. Constructions related to concatenated codes


§8.1. A method for improving concatenated codes. Let $\mathcal{C}'$ be an
[n', k', d' = n' − k' + 1] MDS code over GF($2^k$) (see Ch. 11), with codewords written as
column vectors. If each symbol in a codeword is replaced by the
corresponding binary k-tuple (as in Example 2, §5 of Ch. 10), an n' × k binary
array is obtained. Adding an overall parity check to each row gives an
[n'(k + 1), kk', 2d'] binary code $\mathcal{C}$; see Fig. 18.6. (This is a simple form of
concatenated code.) The minimum distance of $\mathcal{C}$ is at least 2d' because there
are at least d' nonzero rows, each having weight $\ge 2$.

Fig. 18.6. A typical codeword of $\mathcal{C}$ (an n' × k binary array followed by an overall parity check column).

Kasahara et al. [722] have pointed out that it is often possible to increase
the dimension of $\mathcal{C}$ by 1. The new code $\mathcal{D}$ consists of the codewords of $\mathcal{C}$
together with the codewords formed by complementing the last column of
Fig. 18.6.


Theorem 13. The new code 9 has parameters 
[n'(k + 1), kk’ + 1,2d"], (26) 
provided n' > 2d’. 


Problem. (24) Prove this. 


For example, let $\mathcal{C}'$ be a [17, 11, 7] Reed-Solomon code over GF($2^4$). Then
$\mathcal{D}$ is an [85, 45, 14] code. For further examples and a generalization see [722].
Codes obtained in this way are denoted by KS in Fig. 2 of Appendix A. 


Problem. (25) Construct [156, 76, 24] and [168, 76, 28] codes. 


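*§8.2. Zinov'ev's generalized concatenated codes. This is a generalization of
concatenated codes, and produces a number of good codes (denoted by ZV in
Fig. 2 of Appendix A). The construction calls for the following codes. (1) A
collection of r codes $\mathcal{A}_1, \ldots, \mathcal{A}_r$, where $\mathcal{A}_i$ is an $(n, N_i, \delta_i)$ code over GF($q_i$)
and $q_i$ is a power of 2. (2) An $(m, q_1 q_2 \cdots q_r, d_1)$ binary code $\mathcal{B}^{(0)}$ which is the
union of $q_1$ disjoint codes $\mathcal{B}^{(1)}_{i_1}$ ($0 \le i_1 \le q_1 - 1$), where $\mathcal{B}^{(1)}_{i_1}$ is an
$(m, q_2 \cdots q_r, d_2)$ code. Each $\mathcal{B}^{(1)}_{i_1}$ must be the union of $q_2$ disjoint codes
$\mathcal{B}^{(2)}_{i_1 i_2}$ ($0 \le i_2 \le q_2 - 1$), where $\mathcal{B}^{(2)}_{i_1 i_2}$ has parameters $(m, q_3 \cdots q_r, d_3)$, and so on.
Finally, each $\mathcal{B}^{(r-2)}_{i_1 \ldots i_{r-2}}$ must be the union of $q_{r-1}$ disjoint codes
$\mathcal{B}^{(r-1)}_{i_1 \ldots i_{r-1}}$ ($0 \le i_{r-1} \le q_{r-1} - 1$), each with parameters $(m, q_r, d_r)$. Thus

$$\mathcal{B}^{(0)} = \bigcup_{i_1=0}^{q_1-1} \mathcal{B}^{(1)}_{i_1}, \qquad \mathcal{B}^{(1)}_{i_1} = \bigcup_{i_2=0}^{q_2-1} \mathcal{B}^{(2)}_{i_1 i_2}, \ \ldots$$

and $d_1 < d_2 < \cdots < d_r$. A typical codeword of $\mathcal{B}^{(0)}$ will be denoted by
$b_{i_1 i_2 \ldots i_r}$ ($0 \le i_1 \le q_1 - 1, \ldots, 0 \le i_r \le q_r - 1$), with subscripts chosen so that
$b_{i_1 i_2 \ldots i_r}$ belongs to $\mathcal{B}^{(1)}_{i_1}$ and to $\mathcal{B}^{(2)}_{i_1 i_2}$, etc., and is the $i_r$-th codeword of $\mathcal{B}^{(r-1)}_{i_1 \ldots i_{r-1}}$.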

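The construction. Form an n × r array

$$\begin{array}{cccc}
a_1^{(1)} & a_1^{(2)} & \cdots & a_1^{(r)}\\
a_2^{(1)} & a_2^{(2)} & \cdots & a_2^{(r)}\\
\vdots & \vdots & & \vdots\\
a_n^{(1)} & a_n^{(2)} & \cdots & a_n^{(r)}
\end{array} \qquad (27)$$

where the first column is in $\mathcal{A}_1$, the second is in $\mathcal{A}_2$, .... Replace each row
$a_j^{(1)}, a_j^{(2)}, \ldots, a_j^{(r)}$ by the binary vector

$$b_{a_j^{(1)} a_j^{(2)} \ldots a_j^{(r)}}.$$

(To do this, label the elements of each field GF($q_i$) by the numbers
$0, 1, \ldots, q_i - 1$ in some arbitrary but fixed way.) The resulting n × m binary
arrays (considered as binary vectors of length nm) form the new code $\mathcal{Z}$,
called a generalized concatenated code.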


Theorem 14. (Zinov'ev [1470].) $\mathcal{Z}$ is an

$$(nm,\ N_1 N_2 \cdots N_r,\ d \ge \min\{\delta_1 d_1, \ldots, \delta_r d_r\})$$

binary code. (Of course $\mathcal{Z}$ is not necessarily linear.)


Example. Take $\mathcal{B}^{(0)}$ = (4, 16, 1) binary code. $\mathcal{B}^{(0)}$ is the union of the (4, 8, 2)
even weight code $\mathcal{B}^{(1)}_0$ and the (4, 8, 2) odd weight code $\mathcal{B}^{(1)}_1$. Both $\mathcal{B}^{(1)}_0$ and $\mathcal{B}^{(1)}_1$
are the union of 4 translates of the (4, 2, 4) code {0000, 1111}. Thus r = 3,
$q_1 = 2$, $q_2 = 4$, $q_3 = 2$. We also take

$\mathcal{A}_1$ = (8, 2, 8) code over GF(2),

$\mathcal{A}_2$ = (8, $4^4$, 4) code over GF(4) (generated by (43) of Ch. 1),

$\mathcal{A}_3$ = (8, $2^7$, 2) code over GF(2).


Then $\mathcal{Z}$ is a (32, $2^{16}$, 8) code.
Proof of Theorem. Only the minimum distance is not obvious. Suppose $(a_j^{(i)})$
and $(c_j^{(i)})$ are two arrays (27), which first differ in the $\nu$-th column. Then they
differ in at least $\delta_\nu$ places in the $\nu$-th column. By definition $b_{i_1 \ldots i_{\nu-1} \alpha \ldots}$ and
$b_{i_1 \ldots i_{\nu-1} \beta \ldots}$ ($\alpha \ne \beta$) both belong to $\mathcal{B}^{(\nu-1)}_{i_1 \ldots i_{\nu-1}}$, and so differ in at least $d_\nu$ places.
Therefore the corresponding binary codewords of $\mathcal{Z}$ differ in at least $\delta_\nu d_\nu$
places. Q.E.D.


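Problems. (26) Use r = 2, $q_1 = 8$, $q_2 = 2$, $\mathcal{B}^{(0)}$ = [8, 4, 4], $\mathcal{B}^{(1)}$ = [8, 1, 8] binary
codes, $\mathcal{A}_1$ = [7, 2, 6] code over GF(8), $\mathcal{A}_2$ = [7, 4, 3] code over GF(2) to obtain
a [56, 10, 24] binary code $\mathcal{Z}$.

(27) Use r = 4, $\mathcal{B}^{(0)}$ = [16, 11, 4], $\mathcal{B}^{(1)}$ = (16, $2^8$, 6), $\mathcal{B}^{(2)}$ = [16, 5, 8], $\mathcal{B}^{(3)}$ =
[16, 1, 16], $\mathcal{A}_1$ = [6, 1, 6] over GF(8), $\mathcal{A}_2$ = [6, 3, 4] over GF(8), $\mathcal{A}_3$ = [6, 4, 3]
over GF(16), $\mathcal{A}_4$ = [6, 5, 2] over GF(2) to obtain a (96, $2^{33}$, 24) binary code $\mathcal{Z}$.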


§9. Methods for shortening a code


§9.1. Constructions Y1-Y4. Let $\mathcal{C}$ be an [n, k, d] binary code with parity check
matrix H. If i information symbols are set to zero an [n − i, k − i, d] code is
obtained. We can do better if we know the minimum distance d' of the dual
code $\mathcal{C}^{\perp}$. For then we can assume that the first row of H has weight d'. If we
delete the columns of H where the first row is 1, the remaining matrix $H_1$ has
a row of zeros and is the parity check matrix of a code $\mathcal{C}_1$ of length n − d',
dimension (n − d') − (n − k − 1) = k − d' + 1, and minimum distance at least d (by
Theorem 10 of Ch. 1). In fact $\mathcal{C}_1$ consists of exactly those codewords of $\mathcal{C}$
which have zeros under the 1's of the first row of H (with these zeros
deleted). In short:

$$[n, k, d] \text{ code},\ d' = \text{dist. of dual} \ \Rightarrow\ [n - d',\ k - d' + 1,\ \ge d] \text{ code}. \qquad (28)$$


An example was given in §8 of Ch. 1, where this method was used to
construct the Nordstrom-Robinson code from the Golay Code. Further
examples are given in Hashim [623], Helgert and Stinaff [636, 637] and Sloane
et al. [1237]. Codes obtained in this way are indicated by Y1 in Fig. 2 of
Appendix A.
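Construction Y1 can be tried on a code small enough to enumerate. The sketch below is an illustration chosen here, not an example from the text: it uses the [7, 4, 3] Hamming code, whose dual has minimum distance d' = 4, so (28) predicts a [3, 1, ≥ 3] code. The function names are illustrative.

```python
from itertools import product

def codewords(H, n):
    """All binary vectors v of length n with H v = 0 (mod 2)."""
    return [v for v in product((0, 1), repeat=n)
            if all(sum(h * x for h, x in zip(row, v)) % 2 == 0 for row in H)]

def construction_y1(H, n):
    """Keep codewords that vanish under the 1's of the first row of H, then delete those positions."""
    support = [j for j in range(n) if H[0][j] == 1]
    keep = [j for j in range(n) if H[0][j] == 0]
    return {tuple(v[j] for j in keep)
            for v in codewords(H, n) if all(v[j] == 0 for j in support)}

# Parity check matrix of the [7, 4, 3] Hamming code: column j is the binary
# expansion of j + 1, so every row has weight d' = 4.
H = [[(col >> bit) & 1 for col in range(1, 8)] for bit in (2, 1, 0)]
short = construction_y1(H, 7)

print(sorted(short))   # [(0, 0, 0), (1, 1, 1)]
```

The result is the [3, 1, 3] repetition code, matching n − d' = 3 and k − d' + 1 = 1.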


Problems. (28) (a) Take $\mathcal{C}$ to be a $[2^{m+1}, 2^{m+1} - 2m - 3, 6]$ extended BCH code.
From Problem 10 of Ch. 15, the dual code $\mathcal{C}^{\perp}$ has minimum distance
$d' = 2^m - 2^{[(m+1)/2]}$. Hence obtain a d.e.c. code of length $2^m + 2^{[(m+1)/2]} - 1$ and
redundancy 2m + 1.

(b) The dual of the [48, 15, 16] code of Problem 21 has minimum distance 4.
Obtain a [44, 12, 16] code.

(29) Construction Y2. Suppose the first row of H is 11...100...0, of
weight d', and let S consist of the codewords of $\mathcal{C}$ beginning with d' zeros.
Thus $\mathcal{C}_1$ is S with the first d' coordinates deleted. Let T be the union of S and
all of d' − 1 cosets of S in $\mathcal{C}$ with coset leaders $110^{n-2}, 1010^{n-3}, \ldots, 10^{d'-2}10^{n-d'}$.
Show that if the first d' coordinates of T are deleted the result is an

$$(n - d',\ d' \cdot 2^{k-d'+1},\ \ge d - 2) \qquad (29)$$

nonlinear code. For example, use the fact that the dual of the [31, 10, 12] BCH
code has minimum distance d' = 5 to obtain a (26, 320, 10) code.


Construction Y3. Similarly use all cosets with coset representatives of weight
2 to obtain an

$$\Bigl(n - d',\ \Bigl(1 + \binom{d'}{2}\Bigr) 2^{k-d'+1},\ \ge d - 4\Bigr) \qquad (30)$$

nonlinear code. For applications see [1237].


Problem. (30) Construction Y4. Let d'' be the minimum weight of the vector u
OR v, for $u, v \in \mathcal{C}^{\perp}$, $u \ne v$. Show that deleting these columns gives an

$$[n - d'',\ k - d'' + 2,\ \ge d]$$

code. Applications are given in [637].


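§9.2. A construction of Helgert and Stinaff. Suppose $\mathcal{C}$ is an [n, k, d] code
with k × n generator matrix G. The first row of G can be assumed to be
$u = 1^d 0^{n-d}$, thus

$$G = \begin{bmatrix} 1\,1 \cdots 1 & 0\,0 \cdots 0\\ G_1 & G_2 \end{bmatrix}.$$

Then $G_2$ is the generator matrix of an $[n - d, k - 1, d_2]$ code $\mathcal{C}_2$, where
$d_2 \ge \lceil d/2 \rceil$. For let v be any nonzero codeword of $\mathcal{C}_2$, of weight w,
corresponding to a codeword t of $\mathcal{C}$:

$$\begin{array}{rcccc}
 & \lambda & d - \lambda & w & n - d - w\\
t = & 11 \cdots 1 & 00 \cdots 0 & 11 \cdots 1 & 00 \cdots 0\\
u + t = & 00 \cdots 0 & 11 \cdots 1 & 11 \cdots 1 & 00 \cdots 0
\end{array}$$

Then $\lambda + w \ge d$ and $d - \lambda + w \ge d$, hence $2w \ge d$ and $w \ge \lceil d/2 \rceil$. In short
(Helgert and Stinaff [636])

$$[n, k, d] \ \Rightarrow\ [n - d,\ k - 1,\ \ge \lceil d/2 \rceil]. \qquad (31)$$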
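The passage from $\mathcal{C}$ to $\mathcal{C}_2$ in (31) can be illustrated on a small code. In the sketch below (an example chosen here, not one from the text; names are illustrative) the support of a minimum weight codeword of the cyclic [7, 4, 3] Hamming code generated by $1 + x + x^3$ is deleted from every codeword, and (31) predicts a [4, 3, ≥ 2] code.

```python
from itertools import product

def span(basis):
    """All GF(2)-linear combinations of the given basis vectors."""
    words = set()
    for coeffs in product((0, 1), repeat=len(basis)):
        w = [0] * len(basis[0])
        for c, b in zip(coeffs, basis):
            if c:
                w = [p ^ q for p, q in zip(w, b)]
        words.add(tuple(w))
    return words

def residual(code):
    """Delete the support of a minimum weight codeword from every codeword."""
    u = min((w for w in code if any(w)), key=sum)
    keep = [j for j, bit in enumerate(u) if bit == 0]
    return {tuple(w[j] for j in keep) for w in code}

g = [1, 1, 0, 1, 0, 0, 0]                                   # 1 + x + x^3
hamming = span([g[7 - s:] + g[:7 - s] for s in range(4)])   # [7, 4, 3] code
res = residual(hamming)
d2 = min(sum(w) for w in res if any(w))

print(len(res), d2)   # 8 2
```

The shortened code has $2^{k-1} = 8$ words of length n − d = 4 and minimum distance 2 = ⌈3/2⌉, so the bound in (31) is met with equality here.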


For example,

      [24, 12, 8] Golay                  =>  [16, 11, 4] Hamming,
      [51, 8, 24] cyclic (Ref. [266])    =>  [27, 7, 12].

Codes obtained in this way are denoted by HS in Fig. 2 of Appendix A.


Notes on Chapter 18


§1. General references on product codes and generalizations are Berlekamp
and Justesen [127], Burton and Weldon [219], Elias [406], Forney [436], 
Goethals [490], Gore [546], Kasami [726, 731], Lee and Cheng [802] and Weng 
[1407]. See also Hartmann et al. [616-618]. Applications of product codes are 
given by Bahl and Chien [58], Chien and Ng [292], Goethals [489], Kuroda 
[787], Tang and Chien [1306], Weng and Sollman [1408] and Wolf and Elspas 
[1430]. 

For decoding product codes see: Abramson [4], Duc [389], Duc and
Skattebol [391], Kasami and Lin [733], Lin and Weldon [839], Reddy
[1097, 1099], Reddy and Robinson [1100], Wainberg [1381] and Weldon [1404].

Some applications use codewords which are rectangles (rather than vec- 
tors), and here product codes and their generalizations are very useful. The 
theory of two- and higher-dimensional codes is described by Calabro and 
Wolf [229], Gordon [541], Ikai and Kojima [681], MacWilliams and Sloane 
[885], Nomura et al. [998-1001], Reed and Stewart [11062] and Spann [1258]. 
For applications see Calingaert [230], Imai [682], Knowlton [771], Nakamura
and Iwadare [984], Patel and Hong [1029], Schroeder [1168], Sloane et al.
[1230], [1234-1236] and especially [885].

Goldberg [517] and Rao and Reddy [1094] have found several good codes 
by taking the union of a product code and certain of its cosets. This is a 
promising technique which needs further study. 


Research Problem (18.7). Generalize the constructions of [517] and [1094]. 


§2. For the theory of ideals in the group algebra 𝒢 see Ikai and Kojima [681] 
and Nomura et al. [998-1001]. 


§7. The following papers give other techniques for constructing codes: Alltop 
[25], Assmus and Mattson [40], Bambah et al. [64], Bell [98], Blake [159], 
Bobrow et al. [168-170], Bredeson [195], Campopiano [240], Dagnino [326], 
Dénes and Keedwell [371, Ch. 10], Goppa [538], Hakimi et al. [575-577], 
Hashim and Constantinides [624], Hsiao et al. [668-670], Lecointe [800], 
Levitt and Wolf [828], Marchukov [912], Massey et al. [924], Olderogge 
[1010], Salzer [1142], Sauer [1147], Shiva [1198], Shrikhande [1206], Wallis 
[1385] and Wolf [1425, 1428]. 

Another class of codes, abelian group codes, show promise, but so far have 
not led to any really good codes. See Berman [135,136], Camion [238], 
Delsarte [345], and MacWilliams [877, 880]. 

A number of interesting computer searches have been made for good 
codes. See for example Chen [266], Fontaine and Peterson [433, 434], Hashim 
and Constantinides [625], Tokura et al. [1331], and Wagner [1377-1380]. 
Constructions X and X4 are described by Sloane et al. [1237], which also 
gives encoding and decoding methods. See also Zaitsev et al. [1451] and 
Miyakawa et al. [965]. Theorem 12 is from [53]. Curtis [322] gives a different 
and very interesting proof of this theorem. Problem 13(b) is due to Ohno et al. 
[1009], Problems 14(a), (b) and 28(b) to Tezuka and Nakanishi [1315], and 
Problems 14(c), (d) to Kasahara et al. [721]. 


§7.5. For Piret's construction see [1047], and also [1048-1050]. 


Self-dual codes and invariant 
theory 


§1. Introduction 


A linear code 𝒞 is self-dual if 𝒞 = 𝒞^⊥ (§8 of Ch. 1). We have seen that 
many good codes are self-dual, among them the extended Golay codes and 
certain quadratic residue codes. The most important property of self-dual 
codes is given by Gleason’s theorem (Theorem 3c), which imposes strong 
constraints on the weight enumerator. This theorem is proved by a powerful 
nineteenth-century technique known as invariant theory (§§2,3). The same 
technique proves numerous similar theorems, dealing with other kinds of 
weight enumerators (§4). One corollary of Gleason’s theorem is an upper 
bound on the minimum distance of a self-dual code (Theorems 13, 17). 
However (Corollary 16, Theorem 17) this upper bound can only be attained 
for small values of n. Nevertheless self-dual codes exist which meet the 
Gilbert-Varshamov bound (Theorems 21, 24). 

Most of the results in this chapter apply equally well to a larger class of 
codes, namely linear or nonlinear codes with the property that the transform 
of the distance distribution is equal to the distance distribution (e.g. the 
Nordstrom-Robinson code). Such codes are called formally self-dual. The 
formally self-dual codes encountered in this chapter have the same distance 
distribution as weight distribution, and so we work throughout with the weight 
distribution. 

We remind the reader that a self-dual code has even length n and 
dimension n/2. 

The most interesting self-dual or formally self-dual codes have the pro- 
perty that all distances are multiples of some constant t >1. There are just 
four nontrivial cases when this can happen. 
Theorem 1. (Gleason, Pierce and Turyn [53]) Suppose 𝒞 is a formally 
self-dual code over GF(q) in which the distance between codewords is always 
a multiple of t > 1. Then either 𝒞 is a trivial [n, n/2, 2] code with Hamming 
weight enumerator 

$$\bigl(x^2 + (q - 1)y^2\bigr)^{n/2}, \tag{1}$$

for any q, or else one of the following holds: 
(i) q = 2, t = 2, 
(ii) q = 2, t = 4, 
(iii) q = 3, t = 3, 
(iv) q = 4, t = 2. 


The proof is omitted. Codes of type (ii) are sometimes called even self-dual 
codes. 

The following are some examples of self-dual codes. 


Codes of type (i). (E1) The binary code {00, 11} is the simplest self-dual code, 
and has weight enumerator (w.e.) 


$$W_1(x, y) = x^2 + y^2. \tag{2}$$


More generally the code over GF(q) with generator matrix [a b], a, b ≠ 0, is 
formally self-dual and has w.e. x² + (q − 1)y². The direct sum of n/2 such 
codes is formally self-dual and has w.e. given by Equation (1). 

(E2) The (16, 256, 6) Nordstrom-Robinson code 𝒩_16 (§8 of Ch. 2, Ch. 15). 

(E3) The [18, 9, 6] extended QR code (§6 of Ch. 16). 

(E4) The (8, 16, 2) formally self-dual code of Fig. 5.1. 


Codes of type (ii). (E5) The [8, 4, 4] Hamming code with w.e. 

$$W_2(x, y) = x^8 + 14x^4y^4 + y^8. \tag{3}$$

(E6) The [24, 12, 8] Golay code 𝒢_24 with w.e. 

$$W_3(x, y) = x^{24} + 759x^{16}y^8 + 2576x^{12}y^{12} + 759x^8y^{16} + y^{24}. \tag{4}$$


More generally any extended QR code of length n=8m is of type (ii) 
(Theorems 7, 8 of Ch. 16). 

Recall from Ch. 5 that a code over GF(3) has two weight enumerators, the 
complete w.e. W(x, y, z) = Σ A_{ijk} x^i y^j z^k, where A_{ijk} is the number of codewords 
containing i 0's, j 1's, and k 2's; and the Hamming w.e. W(x, y) = W(x, y, y). 


Codes of type (iii). (E7) The [4, 2, 3] ternary code #6 of Ch. 1, with Hamming 
w.e. 


$$W_4(x, y) = x^4 + 8xy^3. \tag{5}$$

(E8) The [12, 6, 6] ternary Golay code 𝒢_12 (defined by (25) of Ch. 16), with 
w.e.'s 

$$\begin{aligned}
W(x, y, z) &= x^{12} + y^{12} + z^{12} + 22(x^6y^6 + x^6z^6 + y^6z^6) \\
&\quad + 220(x^6y^3z^3 + x^3y^6z^3 + x^3y^3z^6), \\
W_5(x, y) &= x^{12} + 264x^6y^6 + 440x^3y^9 + 24y^{12}.
\end{aligned} \tag{6}$$


More generally, any extended ternary QR or symmetry code is of type (iii) 
(by Equation (26) and Theorems 7, 8 and 17 of Ch. 16). 


Codes of type (iv). (E9) The [6, 3, 4] formally self-dual QR code over GF(4) 
(see p. 520 of Ch. 16). This is obtained by adding an overall parity check to the 
cyclic code with generator polynomial (x − β)(x − β⁴), where β ∈ GF(16) is a 
fifth root of unity. A generator matrix is 

[generator matrix illegible in the source] 

where ω ∈ GF(4), ω² + ω + 1 = 0. 


Problems. (1) Let 𝒞 be a cyclic [2n − 1, n] code. Find a necessary and 
sufficient condition on the zeros of 𝒞 for the extended code 𝒞* to be 
self-dual. [Hint: study the cases n = 8, 16.] 

(2) Let q be even. Show that there exists an RS code 𝒞 over GF(q) such 
that 𝒞* is self-dual. Show this is impossible if q is odd. 


Remark. It follows from Theorem 3c below that the [32, 16, 8] QR and 
second-order RM codes have the same weight distribution. However, they are 
not equivalent because: (i) They have different groups. The group of the QR 
code is at least PSL_2(31), of order ½ · 31(31² − 1) = 14880, by Theorem 13 of 
Ch. 16. However, since 29 and 31 are twin primes, a theorem of Neumann 
[988, 989] implies that the group is no bigger than this. On the other hand, the 
group of the RM code is the general affine group GA(5), of order 
32 · 31 · 30 · 28 · 24 · 16 = 319979520, from Theorem 24 of Ch. 13. (ii) The 
cosets have different weight distributions (Berlekamp and Welch [133]). (iii) 
They have different biweight enumerators (cf. Problem 36 of Ch. 5). 

Two inequivalent [14, 3, 5] codes with the same weight distribution are 
given by Fontaine and Peterson ([434]). 


§2. An introduction to invariant theory 


This section is an elementary introduction to invariant theory, showing 
how it is used to study weight enumerators. 
Suppose 𝒞 is a binary self-dual code with all weights divisible by 4, or an 
even self-dual code for short, and let W(x, y) be its weight enumerator. Since 
𝒞 is self-dual, Theorem 1 of Ch. 5 implies 

$$W(x, y) = \frac{1}{2^{n/2}}\, W(x + y,\; x - y) = W\!\left(\frac{x + y}{\sqrt{2}},\; \frac{x - y}{\sqrt{2}}\right) \tag{7}$$

(for W(x, y) is homogeneous of degree n). Since all weights are divisible by 4, 
W(x, y) only contains powers of y⁴. Therefore 

$$W(x, y) = W(x, iy), \tag{8}$$

where i = √(−1). The problem we wish to solve is to find all polynomials 
W(x, y) satisfying (7) and (8). 
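
As a sanity check, the two conditions can be tested mechanically on the weight enumerators already met in Equations (3) and (4). The sketch below is illustrative only: the helper names `transform`, `satisfies_7` and `satisfies_8` are assumptions introduced here, a weight enumerator is stored as the list of coefficients A_0, ..., A_n, and the length n is assumed even.

```python
from fractions import Fraction
from math import comb


def transform(weights):
    """Coefficients of 2^(-n/2) * W(x + y, x - y).

    weights[i] is A_i, the coefficient of x^(n-i) y^i in W(x, y).
    """
    n = len(weights) - 1
    out = [0] * (n + 1)
    for i, a_i in enumerate(weights):
        if a_i == 0:
            continue
        # Expand (x + y)^(n-i) * (x - y)^i and collect x^(n-j) y^j.
        for j in range(n + 1):
            lo, hi = max(0, j - (n - i)), min(i, j)
            coeff = sum((-1) ** s * comb(i, s) * comb(n - i, j - s)
                        for s in range(lo, hi + 1))
            out[j] += a_i * coeff
    return [Fraction(c, 2 ** (n // 2)) for c in out]


def satisfies_7(weights):
    """Equation (7): W is fixed by the substitution above."""
    return transform(weights) == list(weights)


def satisfies_8(weights):
    """Equation (8): W(x, iy) = W(x, y) iff A_i = 0 unless 4 divides i."""
    return all(a == 0 for i, a in enumerate(weights) if i % 4)


hamming = [1, 0, 0, 0, 14, 0, 0, 0, 1]            # Equation (3)
golay = [0] * 25
for weight, count in {0: 1, 8: 759, 12: 2576, 16: 759, 24: 1}.items():
    golay[weight] = count                          # Equation (4)
```

Both `hamming` and `golay` pass both checks, whereas `[1, 0, 1]` (that is, x² + y² of Equation (2)) satisfies (7) but not (8), as expected for a self-dual code whose weights are not all divisible by 4.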










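Invariants. Equation (7) says that W(x, y) is unchanged, or invariant, under 
the linear transformation 

$$T_1:\quad \text{replace } x \text{ by } \frac{x + y}{\sqrt{2}}, \qquad \text{replace } y \text{ by } \frac{x - y}{\sqrt{2}},$$

or, in matrix notation, 

$$T_1:\quad \text{replace } \begin{pmatrix} x \\ y \end{pmatrix} \text{ by } \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$$

Similarly Equation (8) says that W(x, y) is also invariant under the 
transformation 

$$T_2:\quad \text{replace } x \text{ by } x, \qquad \text{replace } y \text{ by } iy,$$

or 

$$T_2:\quad \text{replace } \begin{pmatrix} x \\ y \end{pmatrix} \text{ by } \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$$

Of course W(x, y) must therefore be invariant under any combination T_1², 
T_1T_2, T_1T_2T_1, ... of these transformations. It is not difficult to show (as we 
shall see in §3) that the matrices 

$$\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$$

when multiplied together in all possible ways produce a group 𝒢_1 containing 
192 matrices. 

So our problem now says: find the polynomials W(x, y) which are invariant 
under all 192 matrices in the group 𝒢_1. 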


How many invariants? The first thing we want to know is how many 
invariants there are. This isn’t too precise, because of course if f and g are 
invariants, so is any constant multiple cf and also f + g, f — g and the product 
fg. Also it is enough to study the homogeneous invariants (in which all terms 
have the same degree). 

So the right question to ask is: how many linearly independent, homogeneous 
invariants are there of each degree d? Let's call this number a_d. 

A convenient way to handle the numbers a_0, a_1, a_2, ... is by combining 
them into a power series or generating function 

$$\Phi(\lambda) = a_0 + a_1\lambda + a_2\lambda^2 + \cdots.$$

Conversely, if we know Φ(λ), the numbers a_d can be recovered from the 
power series expansion of Φ(λ). 

At this point we invoke a beautiful theorem of T. Molien, published in 1897 
([971]; Bourbaki [190, p. 110], Burnside [211, p. 301], Miller et al. [955, p. 
259]). 


Theorem 2. For any finite group 𝒢 of complex m × m matrices, Φ(λ) is given 
by 

$$\Phi(\lambda) = \frac{1}{|\mathscr{G}|} \sum_{A \in \mathscr{G}} \frac{1}{\det(I - \lambda A)}, \tag{9}$$

where |𝒢| is the number of matrices in 𝒢, det stands for determinant, and I is a 
unit matrix. In words, Φ(λ) is the average, taken over all matrices A in the 
group, of the reciprocal of the polynomial det(I − λA). 


We call Φ(λ) the Molien series of 𝒢. The proof of this theorem is given in §3. 

For our group 𝒢_1, from the matrices corresponding to I, T_1, T_2, ... we get 

$$\Phi(\lambda) = \frac{1}{192}\left\{ \frac{1}{(1-\lambda)^2} + \frac{1}{1-\lambda^2} + \frac{1}{(1-\lambda)(1-i\lambda)} + \cdots \right\}. \tag{10}$$

There are shortcuts, but it is quite feasible to work out the 192 terms directly 
(many are the same) and add them. The result is a surprise: everything 
collapses to give 

$$\Phi(\lambda) = \frac{1}{(1-\lambda^8)(1-\lambda^{24})}. \tag{11}$$


Interpreting Φ(λ). The very simple form of (11) is trying to tell us something. 
Expanding in powers of λ, we have 

$$\Phi(\lambda) = a_0 + a_1\lambda + a_2\lambda^2 + \cdots = (1 + \lambda^8 + \lambda^{16} + \lambda^{24} + \cdots)(1 + \lambda^{24} + \lambda^{48} + \cdots). \tag{12}$$

We can deduce one fact immediately: a_d is zero unless d is a multiple of 8. 
I.e., the degree of a homogeneous invariant must be a multiple of 8. (This 
already proves that the block length of an even self-dual code is a multiple of 
8.) But we can say more. The RHS of (12) is exactly what we would find if 
there were two "basic" invariants, of degrees 8 and 24, such that all invariants 
are formed from sums and products of them. 

This is because two invariants, θ, of degree 8, and φ, of degree 24, would 
give rise to the following invariants. 


degree d    invariants          number a_d 

0           1                   1 
8           θ                   1 
16          θ²                  1 
24          θ³, φ               2 
32          θ⁴, θφ              2 
40          θ⁵, θ²φ             2 
48          θ⁶, θ³φ, φ²         3 
                                            (13) 


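Provided all the products θ^i φ^j are linearly independent (which is the same 
thing as saying that θ and φ are algebraically independent) the numbers a_d in 
(13) are exactly the coefficients in 

$$\begin{aligned}
1 + \lambda^8 &+ \lambda^{16} + 2\lambda^{24} + 2\lambda^{32} + 2\lambda^{40} + 3\lambda^{48} + \cdots \\
&= (1 + \lambda^8 + \lambda^{16} + \lambda^{24} + \cdots)(1 + \lambda^{24} + \lambda^{48} + \cdots) \\
&= \frac{1}{(1-\lambda^8)(1-\lambda^{24})},
\end{aligned} \tag{14}$$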


which agrees with (11). So if we can find two algebraically independent 
invariants of degrees 8 and 24, we will have solved our problem. The answer 
will be that any invariant of this group is a polynomial in θ and φ. Now 
W_2(x, y) and W_3(x, y), the weight enumerators of the Hamming and Golay 
codes, have degrees 8 and 24 and are invariant under the group. So we can 
take θ = W_2(x, y) and φ = W_3(x, y). (It's not difficult to verify that they are 
algebraically independent.) Actually, it is easier to work with 

$$W_3'(x, y) = \frac{W_2(x, y)^3 - W_3(x, y)}{42} = x^4y^4(x^4 - y^4)^4 \tag{15}$$

rather than W_3(x, y) itself. So we have proved the following theorem, 
discovered by Gleason in 1970. 
Theorem 3a. Any invariant of the group 𝒢_1 is a polynomial in W_2(x, y) 
(Equation (3)) and W_3'(x, y) (Equation (15)). 


This also gives us the solution to our original problem: 


Theorem 3b. Any polynomial which satisfies Equations (7) and (8) is a 
polynomial in W_2(x, y) and W_3'(x, y). 


Finally, we have characterized the weight enumerator of an even self-dual 
code. 


Theorem 3c. (Gleason [486].) The weight enumerator of any even self-dual 
code is a polynomial in W_2(x, y) and W_3'(x, y). 


Alternative proofs of this theorem are given by Berlekamp et al. [129], 
Broué and Enguehard [201], and Feit [424] (see also Assmus and Mattson 
[47]). But the proof given here seems to be the most informative, and the 
easiest to understand and to generalize. 

Notice how the exponents 8 and 24 in the denominator of (11) led us to 
guess the degrees of the basic invariants. 
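
The bookkeeping behind table (13) is easy to mechanize. The following sketch (the function names are illustrative assumptions, not notation from the text) expands 1/((1 − λ^8)(1 − λ^24)) as a power series and compares each coefficient with the number of monomials θ^i φ^j of that degree.

```python
def molien_coefficients(degrees, max_degree):
    """Coefficients of prod 1/(1 - lambda^e), e in degrees, up to max_degree."""
    coeffs = [1] + [0] * max_degree
    for e in degrees:
        # Multiplying by 1/(1 - lambda^e) is a running sum with stride e.
        for d in range(e, max_degree + 1):
            coeffs[d] += coeffs[d - e]
    return coeffs


def monomials(d, deg_theta=8, deg_phi=24):
    """All exponent pairs (i, j) such that theta^i * phi^j has degree d."""
    return [(i, (d - deg_theta * i) // deg_phi)
            for i in range(d // deg_theta + 1)
            if (d - deg_theta * i) % deg_phi == 0]


if __name__ == "__main__":
    coeffs = molien_coefficients((8, 24), 48)
    for d in range(0, 49, 8):
        print(d, coeffs[d], monomials(d))
```

For d = 48, for instance, `monomials(48)` lists the three exponent pairs corresponding to φ², θ³φ and θ⁶, in agreement with the last row of (13).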

This behavior is typical, and is what makes the technique exciting to use. 
One starts with a group of matrices 𝒢, computes the complicated-looking sum 
shown in Equation (9), and simplifies the result. Everything miraculously 
collapses, leaving a final expression resembling Equation (11) (although not 
always quite so simple; the precise form of the final expression is given in 
§3). This expression then tells the degrees of the basic invariants to look for. 


Problem. (3) (Gleason [486; 129, 883].) Use the above technique to show that 
the weight enumerator of any binary self-dual code is a polynomial in 
W_1(x, y) = x² + y² and 

$$W_2'(x, y) = \frac{W_1(x, y)^4 - W_2(x, y)}{4} = x^2y^2(x^2 - y^2)^2. \tag{16}$$

[Hint: By Problem 38 of Ch. 1, all weights are divisible by 2. The group is 
generated by T_1 and 

$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

has order 16 and Molien series 1/((1 − λ²)(1 − λ⁸)).] 


Finding the basic invariants. In general, finding the basic invariants is a 
simpler problem than finding Φ(λ). Either one can use the weight enumerators 
of codes having the appropriate properties, as in the above example, or basic 
invariants can be found by averaging, using the following simple result 
(proved in §3). 


Theorem 4. If f(x) = f(x_1, ..., x_m) is any polynomial in m variables, and 𝒢 is a 
finite group of m × m matrices, then 

$$\bar{f}(x) = \frac{1}{|\mathscr{G}|} \sum_{A \in \mathscr{G}} A \circ f(x)$$

is an invariant, where A ∘ f(x) denotes the polynomial obtained by applying 
the transformation A to the variables in f. 


Of course f̄(x) may be zero. An example of the use of this theorem is 
given below. 


An application of Theorem 3. To show the power of Theorem 3c, we use it to 
find the w.e. of the [48, 24] QR code. From Theorems 7, 8, 25 of Ch. 16, this is 
an even self-dual code with minimum distance 12. Therefore the weight 
enumerator of the code, which is a homogeneous polynomial of degree 48, has 
the form 

$$W(x, y) = x^{48} + A_{12}x^{36}y^{12} + \cdots. \tag{17}$$

The coefficients of x^47 y, x^46 y², ..., x^37 y^11 are zero. Here A_12 is the unknown 
number of codewords of weight 12. It is remarkable that, once we know 
Equation (17), the weight enumerator is completely determined by Theorem 
3c. For Theorem 3c says that W(x, y) must be a polynomial in W_2(x, y) and 
W_3'(x, y). Since W(x, y) is homogeneous of degree 48, W_2 is homogeneous of 
degree 8, and W_3' is homogeneous of degree 24, this polynomial must be a 
linear combination of W_2^6, W_2^3 W_3' and W_3'^2. 

Thus Theorem 3c says that 

$$W(x, y) = a_0W_2^6 + a_1W_2^3W_3' + a_2W_3'^2 \tag{18}$$

for some real numbers a_0, a_1, a_2. Expanding Equation (18) we have 

$$\begin{aligned}
W(x, y) &= a_0(x^{48} + 84x^{44}y^4 + 2946x^{40}y^8 + \cdots) \\
&\quad + a_1(x^{44}y^4 + 38x^{40}y^8 + \cdots) \\
&\quad + a_2(x^{40}y^8 - \cdots),
\end{aligned} \tag{19}$$

and equating coefficients in Equations (17), (19) we get 

a_0 = 1, a_1 = −84, a_2 = 246. 


Therefore W(x, y) is uniquely determined. When the values of a_0, a_1, a_2 are 
substituted in (18) it is found that 

$$\begin{aligned}
W(x, y) &= x^{48} + 17296x^{36}y^{12} + 535095x^{32}y^{16} \\
&\quad + 3995376x^{28}y^{20} + 7681680x^{24}y^{24} + 3995376x^{20}y^{28} \\
&\quad + 535095x^{16}y^{32} + 17296x^{12}y^{36} + y^{48}.
\end{aligned} \tag{20}$$

Since the minimum distance is as large as it can be, this is called an extremal 
weight enumerator (see §5). Direct calculation of this weight enumerator 
would require finding the weight of each of the 2^24 ≈ 1.7 × 10^7 codewords, a 
respectable job even for a computer. 
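
The calculation leading from (17) to (20) can be replayed in a few lines. The sketch below is one possible way to organize it, not the authors' procedure: since every polynomial involved is homogeneous and contains only powers of y⁴, a polynomial is stored as the list of its coefficients in t = y⁴, and the helper names are assumptions made for this illustration.

```python
def poly_mul(p, q):
    """Multiply two polynomials in t = y^4 given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_pow(p, n):
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out


# W_2 and W_3' written in t = y^4 (the power of x is implied by homogeneity).
W2 = [1, 14, 1]                 # x^8 + 14 x^4 y^4 + y^8, Equation (3)
W3P = [0, 1, -4, 6, -4, 1, 0]   # x^4 y^4 (x^4 - y^4)^4, Equation (15)

basis = [poly_pow(W2, 6),                    # W_2^6
         poly_mul(poly_pow(W2, 3), W3P),     # W_2^3 W_3'
         poly_pow(W3P, 2)]                   # W_3'^2

# Minimum distance 12 forces the coefficients of y^0, y^4, y^8 to be 1, 0, 0.
# The resulting linear system is triangular, so solve it step by step.
a0 = 1
a1 = -a0 * basis[0][1] // basis[1][1]
a2 = -(a0 * basis[0][2] + a1 * basis[1][2]) // basis[2][2]

# W[k] is the number of codewords of weight 4k.
W = [a0 * u + a1 * v + a2 * w for u, v, w in zip(*basis)]
```

Here `(a0, a1, a2)` comes out as (1, −84, 246) and `W[3]`, the coefficient of x^36 y^12, as 17296, matching Equations (19) and (20); the entries of `W` sum to 2^24, the number of codewords.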

Of course there is also a fair amount of algebra involved in the invariant 
theory method, although in the preceding example it can be done by hand. 
The reader may find it helpful if we give a second example, in which the 
algebra can be shown in full. 


A very simple example. The weight enumerator of a self-dual code over GF(q) 
by Theorem 13 of Ch. 5 satisfies the equation 

$$W\!\left(\frac{x + (q-1)y}{\sqrt{q}},\; \frac{x - y}{\sqrt{q}}\right) = W(x, y). \tag{21}$$

Problem: find all polynomials which satisfy Equation (21). 

The solution proceeds as before. Equation (21) says that W(x, y) must be 
invariant under the transformation 

$$T_3:\quad \text{replace } \begin{pmatrix} x \\ y \end{pmatrix} \text{ by } A \begin{pmatrix} x \\ y \end{pmatrix},$$

where 

$$A = \frac{1}{\sqrt{q}} \begin{pmatrix} 1 & q - 1 \\ 1 & -1 \end{pmatrix}. \tag{22}$$
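Now A² = I, so W(x, y) must be invariant under the group 𝒢_2 consisting of the 
two matrices I and A. 

To find how many invariants there are, we compute the Molien series Φ(λ) 
from Equation (9). We find 

$$\det(I - \lambda I) = (1 - \lambda)^2,$$

$$\det(I - \lambda A) = \det \begin{pmatrix} 1 - \dfrac{\lambda}{\sqrt{q}} & -\dfrac{(q-1)\lambda}{\sqrt{q}} \\[2ex] -\dfrac{\lambda}{\sqrt{q}} & 1 + \dfrac{\lambda}{\sqrt{q}} \end{pmatrix} = 1 - \lambda^2,$$

$$\Phi(\lambda) = \frac{1}{2}\left(\frac{1}{(1-\lambda)^2} + \frac{1}{1-\lambda^2}\right) = \frac{1}{(1-\lambda)(1-\lambda^2)}, \tag{23}$$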
which is even simpler than Equation (11). Equation (23) suggests that there 
might be two basic invariants, of degrees 1 and 2 (the exponents in the 
denominator). If algebraically independent invariants of degrees 1 and 2 can 
be found, say g and h, then Equation (23) implies that any invariant of 𝒢_2 is a 
polynomial in g and h. 

This time we shall use the method of averaging to find the basic invariants. 
Let us average x over the group, i.e., apply Theorem 4 with f(x, y) = x. The 
matrix I leaves x unchanged, of course, and the matrix A transforms x into 
(1/√q)(x + (q − 1)y). Therefore the average, 


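$$\bar{f}(x, y) = \frac{1}{2}\left[x + \frac{1}{\sqrt{q}}\bigl(x + (q-1)y\bigr)\right] = \frac{\sqrt{q} + 1}{2\sqrt{q}}\,\bigl\{x + (\sqrt{q} - 1)y\bigr\},$$

is an invariant. Of course any scalar multiple of f̄(x, y) is also an invariant, so we 
may divide by (√q + 1)/(2√q) and take 

$$g = x + (\sqrt{q} - 1)y \tag{24}$$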


to be the basic invariant of degree 1. To get an invariant of degree 2 we 
average x² over the group, obtaining 

$$\frac{1}{2}\left[x^2 + \frac{1}{q}\bigl(x + (q-1)y\bigr)^2\right].$$

This can be cleaned up by subtracting ((q + 1)/2q)g² (which of course is an 
invariant), and dividing by a suitable constant. The result is 

$$h = y(x - y),$$

the desired basic invariant of degree 2. 
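
For readers who want to see the "cleaning up" step written out, here is one way to carry it through. The notation x̄² for the average of x² is introduced only for this display; the computation uses nothing beyond Equation (24) and the identity q − 1 = (√q − 1)(√q + 1).

```latex
\begin{align*}
\overline{x^2}
  &= \frac12\left[x^2 + \frac1q\bigl(x + (q-1)y\bigr)^2\right]
   = \frac{q+1}{2q}\,x^2 + \frac{q-1}{q}\,xy + \frac{(q-1)^2}{2q}\,y^2,\\[1ex]
\frac{q+1}{2q}\,g^2
  &= \frac{q+1}{2q}\Bigl[x^2 + 2(\sqrt q - 1)\,xy + (\sqrt q - 1)^2 y^2\Bigr],\\[1ex]
\overline{x^2} - \frac{q+1}{2q}\,g^2
  &= -\frac{(\sqrt q - 1)^2}{\sqrt q}\,xy + \frac{(\sqrt q - 1)^2}{\sqrt q}\,y^2
   = -\frac{(\sqrt q - 1)^2}{\sqrt q}\; y(x - y).
\end{align*}
```

So the "suitable constant" is −(√q − 1)²/√q, and dividing by it leaves h = y(x − y).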
Finally g and h must be shown to be algebraically independent: it must be 
shown that no sum of the form 

$$\sum_{i,j} c_{ij}\, g^i h^j, \qquad c_{ij} \text{ complex and not all zero}, \tag{25}$$

is identically zero when expanded in powers of x and y. This can be seen by 
looking at the leading terms. (The leading term of a polynomial is the first one 
to be written down when using the natural ordering illustrated in Equations 
(15), (20), (24).) Thus the leading term of g is x, the leading term of h is xy, 
and the leading term of g^i h^j is x^{i+j} y^j. Since distinct summands in Equation (25) 
have distinct leading terms, (25) can only add to zero if all the c_{ij} are zero. 
Therefore g and h are algebraically independent. So we have proved: 


Theorem 5. Any invariant of the group 𝒢_2, or equivalently any polynomial 
satisfying (21), or equivalently the weight enumerator of any self-dual code 
over GF(q), is a polynomial in g = x + (√q − 1)y and h = y(x − y). 
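
A small exact-arithmetic check for q = 4 (where √q = 2 is rational) may make the invariants concrete. The sketch below is an illustration under stated assumptions: polynomials in x and y are stored as dictionaries mapping exponent pairs (i, j) to coefficients, all helper names are invented for this example, and the weight enumerator x⁶ + 45x²y⁴ + 18y⁶ of the [6, 3, 4] code over GF(4) is quoted as a known fact rather than taken from the text above.

```python
from fractions import Fraction


def poly_mul(p, q):
    out = {}
    for (i, j), a in p.items():
        for (k, l), b in q.items():
            key = (i + k, j + l)
            out[key] = out.get(key, 0) + a * b
    return {m: c for m, c in out.items() if c != 0}


def poly_add(p, q, scale=1):
    """Return p + scale * q."""
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + scale * c
    return {m: c for m, c in out.items() if c != 0}


def poly_pow(p, n):
    out = {(0, 0): Fraction(1)}
    for _ in range(n):
        out = poly_mul(out, p)
    return out


def substitute(p, new_x, new_y):
    """Replace x by new_x and y by new_y in the polynomial p."""
    out = {}
    for (i, j), c in p.items():
        term = poly_mul(poly_pow(new_x, i), poly_pow(new_y, j))
        out = poly_add(out, term, scale=c)
    return out


# q = 4, so sqrt(q) = 2 and A = (1/2) [[1, 3], [1, -1]], cf. Equation (22).
half = Fraction(1, 2)
AX = {(1, 0): half, (0, 1): 3 * half}     # image of x: (x + 3y) / 2
AY = {(1, 0): half, (0, 1): -half}        # image of y: (x - y) / 2


def act(p):
    return substitute(p, AX, AY)


g = {(1, 0): Fraction(1), (0, 1): Fraction(1)}        # x + (sqrt(q) - 1) y
h = {(1, 1): Fraction(1), (0, 2): Fraction(-1)}       # y (x - y)
x = {(1, 0): Fraction(1)}
average_of_x = poly_add({}, poly_add(x, act(x)), scale=half)  # (x + A∘x) / 2

# Weight enumerator of the [6, 3, 4] code over GF(4) (assumed known).
hexacode = {(6, 0): Fraction(1), (2, 4): Fraction(45), (0, 6): Fraction(18)}
```

With these definitions `act(g) == g`, `act(h) == h` and `act(hexacode) == hexacode`, and `average_of_x` equals (3/4)g, which is the factor (√q + 1)/(2√q) for q = 4.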
At this point the reader should cry Stop!, and point out that a self-dual 
code must have even length and so every term in the weight enumerator must 
have even degree. But in Theorem 5 g has degree 1. 

Thus we haven't made use of everything we know about the code. W(x, y) 
must also be invariant under the transformation 

$$\text{replace } \begin{pmatrix} x \\ y \end{pmatrix} \text{ by } B \begin{pmatrix} x \\ y \end{pmatrix}, \quad\text{where}\quad B = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = -I.$$

This rules out terms of odd degree. So W(x, y) is invariant under the group 𝒢_3 
generated by A and B, which consists of 

I, A, −I, −A. 

The reader can easily work out that the new Molien series is 

$$\Phi_{\mathscr{G}_3}(\lambda) = \frac{1}{2}\bigl\{\Phi_{\mathscr{G}_2}(\lambda) + \Phi_{\mathscr{G}_2}(-\lambda)\bigr\} = \frac{1}{2}\left\{\frac{1}{(1-\lambda)(1-\lambda^2)} + \frac{1}{(1+\lambda)(1-\lambda^2)}\right\} = \frac{1}{(1-\lambda^2)^2}. \tag{26}$$

There are now two basic invariants, both of degree 2 (matching the exponents 
in the denominator of (26)), say g² and h, or the equivalent and slightly 
simpler pair g* = x² + (q − 1)y² and h = y(x − y). Hence: 


Theorem 6. The weight enumerator of any self-dual code over GF(q) is a 
polynomial in g* and h. 


The general plan of attack. As these examples have illustrated, there are two 
stages in using invariant theory to solve a problem. 


Stage I. Convert the assumptions about the problem (e.g. the code) into 
algebraic constraints on polynomials (e.g. weight enumerators). 


Stage II. Use the invariant theory to find all possible polynomials satisfying 
these constraints. 
§3. The basic theorems of invariant theory 


Groups of matrices. Given a collection A_1, ..., A_r of m × m invertible matrices 
we can form a group 𝒢 from them by multiplying them together in all possible 
ways. Thus 𝒢 contains the matrices I, A_1, A_2, ..., A_1A_2, ..., A_2A_1^{−1}A_2^{−1}A_3, ... 
We say that 𝒢 is generated by A_1, ..., A_r. Of course 𝒢 may be infinite, in 
which case the theory of invariants described here doesn't directly apply. (But 
see Dieudonné and Carrell [375], Rallis [1087] and Weyl [1410].) 


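Example. Let us show that the group 𝒢_1 generated by the matrices 

$$M = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \quad\text{and}\quad J = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$$

that was encountered in §2 does indeed have order 192. The key is to discover 
(by randomly multiplying matrices together) that 𝒢_1 contains 

$$J^2 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad E = (MJ)^3 = \frac{1+i}{\sqrt{2}} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad E^2 = i \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad R = MJ^2M = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

So 𝒢_1 contains the matrices 

$$\alpha \begin{pmatrix} 1 & 0 \\ 0 & \pm 1 \end{pmatrix}, \quad \alpha \begin{pmatrix} 0 & 1 \\ \pm 1 & 0 \end{pmatrix}, \quad \alpha \in \{1, i, -1, -i\},$$

which form a subgroup ℋ_1 of order 16. From this it is easy to see that 𝒢_1 
consists of the union of 12 cosets of ℋ_1: 

$$\mathscr{G}_1 = \bigcup_{k=1}^{12} a_k \mathscr{H}_1, \tag{27}$$

where a_1, ..., a_6 are respectively 

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},\; \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix},\; \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},\; \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix},\; \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix},\; \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix},$$

a_7 = ηa_1, ..., a_12 = ηa_6, and η = (1 + i)/√2. Thus 𝒢_1 consists of the 192 
matrices 

$$\eta^{\nu} \begin{pmatrix} 1 & 0 \\ 0 & \alpha \end{pmatrix}, \quad \eta^{\nu} \begin{pmatrix} 0 & 1 \\ \alpha & 0 \end{pmatrix}, \quad \eta^{\nu}\, \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & \beta \\ \alpha & -\alpha\beta \end{pmatrix}, \tag{28}$$

for 0 ≤ ν ≤ 7 and α, β ∈ {1, i, −1, −i}. 

As a check, one verifies that every matrix in (28) can be written as a 
product of M's and J's; that the product of two matrices in (28) is again in 
(28); and that the inverse of every matrix in (28) is in (28). Therefore (28) is a 
group, and is the group generated by M and J. Thus 𝒢_1 is indeed equal to (28). 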
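
The order of the group and its Molien series can also be checked by brute force. The following sketch is only a numerical illustration, not part of the argument: it closes {M, J} under multiplication using floating-point complex arithmetic, identifies matrices by their entries rounded to nine decimals (an assumption that is harmless here because distinct group elements differ by far more than rounding error), and then averages the power-series coefficients of 1/det(I − λA) as in Theorem 2. All function names are invented for this example.

```python
from math import sqrt


def mat_mul(p, q):
    """Product of two 2x2 matrices stored row-wise as (a, b, c, d)."""
    a, b, c, d = p
    e, f, g, h = q
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)


def key(p, digits=9):
    """Hashable fingerprint of a matrix; adding 0.0 normalizes -0.0."""
    return tuple((round(complex(z).real, digits) + 0.0,
                  round(complex(z).imag, digits) + 0.0) for z in p)


def generate_group(generators):
    """Close the generators under multiplication (finite groups only)."""
    identity = (1, 0, 0, 1)
    seen = {key(identity): identity}
    frontier = [identity]
    while frontier:
        new = []
        for p in frontier:
            for g in generators:
                q = mat_mul(p, g)
                if key(q) not in seen:
                    seen[key(q)] = q
                    new.append(q)
        frontier = new
    return list(seen.values())


def molien_coefficients(group, max_degree):
    """a_0, ..., a_max_degree from Theorem 2, for a group of 2x2 matrices."""
    totals = [0j] * (max_degree + 1)
    for a, b, c, d in group:
        tr, det = a + d, a * d - b * c
        # 1/det(I - lambda*A) = 1/(1 - tr*lambda + det*lambda^2)
        coeffs = [1, tr]
        for _ in range(2, max_degree + 1):
            coeffs.append(tr * coeffs[-1] - det * coeffs[-2])
        for n in range(max_degree + 1):
            totals[n] += coeffs[n]
    return [round((t / len(group)).real) for t in totals]


S = 1 / sqrt(2)
M = (S, S, S, -S)
J = (1, 0, 0, 1j)
G1 = generate_group([M, J])

# The family of matrices listed in (28).
ETA = (1 + 1j) / sqrt(2)
UNITS = (1, 1j, -1, -1j)
family = set()
for nu in range(8):
    s = ETA ** nu
    for alpha in UNITS:
        family.add(key((s, 0, 0, s * alpha)))
        family.add(key((0, s, s * alpha, 0)))
        for beta in UNITS:
            family.add(key((s * S, s * S * beta,
                            s * S * alpha, -s * S * alpha * beta)))
```

Running this gives `len(G1) == 192`, the set of fingerprints of `G1` coincides with `family`, and `molien_coefficients(G1, 48)` is nonzero only in degrees divisible by 8, in agreement with Equation (11).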
We have gone into this example in some detail to emphasize that it is 
important to begin by understanding the group thoroughly. (For an alternative 
way of studying 𝒢_1 see [201; pp. 160-161].) 


Invariants. To quote Hermann Weyl [1409], "the theory of invariants came 
into existence about the middle of the nineteenth century somewhat like 
Minerva: a grown-up virgin, mailed in the shining armor of algebra, she 
sprang forth from Cayley's Jovian head." Invariant theory became one of the 
main branches of nineteenth century mathematics, but dropped out of fashion 
after Hilbert's work: see Fisher [431] and Reid [1108]. Recently, however, 
there has been a resurgence of interest, with applications in algebraic 
geometry (Dieudonné and Carrell [375], Mumford [976]), physics (see for 
example Agrawala and Belinfante [7] and the references given there), 
combinatorics (Doubilet et al. [381], Rota [1125]), and coding theory 
([883, 894, 895]). 

There are several different kinds of invariants, but here an invariant is 
defined as follows. 

Let 𝒢 be a group of g m × m complex matrices A_1, ..., A_g, where the 
(i, k)th entry of A_α is a_{ik}^{(α)}. In other words 𝒢 is a group of linear transformations 
on the variables x_1, ..., x_m, consisting of the transformations 

$$T^{(\alpha)}:\quad \text{replace } x_i \text{ by } x_i^{(\alpha)} = \sum_{k=1}^{m} a_{ik}^{(\alpha)} x_k, \qquad i = 1, \ldots, m \tag{29}$$

for α = 1, 2, ..., g. It is worthwhile giving a careful description of how a 
polynomial f(x) = f(x_1, ..., x_m) is transformed by a matrix A_α in 𝒢. The 
transformed polynomial is 

$$A_\alpha \circ f(x) = f(x_1^{(\alpha)}, \ldots, x_m^{(\alpha)}),$$

where each x_i^{(α)} is replaced by Σ_{k=1}^{m} a_{ik}^{(α)} x_k. Another way of describing this is to 
think of x = (x_1, ..., x_m)^T as a column vector. Then f(x) is transformed into 

$$A_\alpha \circ f(x) = f(A_\alpha x), \tag{30}$$

where A_α x is the usual product of a matrix and a vector. One can check that 

$$B \circ (A \circ f(x)) = (AB) \circ f(x) = f(ABx). \tag{31}$$

For example, 

$$A = \begin{pmatrix} 1 & 2 \\ 0 & -1 \end{pmatrix}$$

transforms x_1² + x_2 into (x_1 + 2x_2)² − x_2. 


Definition. An invariant of 𝒢 is a polynomial f(x) which is unchanged by 
every linear transformation in 𝒢. In other words, f(x) is an invariant of 𝒢 if 

$$A_\alpha \circ f(x) = f(A_\alpha x) = f(x)$$

for all α = 1, ..., g. 


Example. Let 

$$\mathscr{G}_4 = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},\; \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \right\},$$

a group of order g = 2. Then x², xy and y² are homogeneous invariants of 
degree 2. 

Even if f(x) isn't an invariant, its average over the group is, as was 
mentioned in §2. 


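Theorem 4. Let f(x) be any polynomial. Then 

$$\bar{f}(x) = \frac{1}{g} \sum_{\alpha=1}^{g} A_\alpha \circ f(x) \tag{32}$$

is an invariant of 𝒢. 


Proof. Any A_β ∈ 𝒢 transforms the right-hand side of (32) into 

$$\frac{1}{g} \sum_{\alpha=1}^{g} (A_\alpha A_\beta) \circ f(x), \quad\text{by (31).} \tag{33}$$

As A_α runs through 𝒢, so does A_αA_β, if A_β is fixed. Therefore (33) is equal to 

$$\frac{1}{g} \sum_{\gamma=1}^{g} A_\gamma \circ f(x),$$

which is f̄(x). Therefore f̄(x) is an invariant. Q.E.D. 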


More generally, any symmetric function of the g polynomials A_1 ∘ f(x), ..., 
A_g ∘ f(x) is an invariant of 𝒢. 

Clearly if f(x) and h(x) are invariants of 𝒢, so are f(x) + h(x), f(x)h(x), and 
cf(x) (c complex). This is equivalent to saying that the set of invariants of 𝒢, 
which we denote by 𝒥(𝒢), forms a ring (see p. 189). 

One of the main problems of invariant theory is to describe 𝒥(𝒢). Since the 
transformations in 𝒢 don't change the degree of a polynomial, it is enough to 
describe the homogeneous invariants (for any invariant is a sum of 
homogeneous invariants). 


Basic invariants. Our goal is to find a "basis" for the invariants of 𝒢, that is, a 
set of basic invariants such that any invariant can be expressed in terms of 
this set. There are two different types of bases one might look for. 

Definition. Polynomials f_1(x), ..., f_r(x) are called algebraically dependent if 
there is a polynomial p in r variables with complex coefficients, not all zero, 
such that p(f_1(x), ..., f_r(x)) ≡ 0. Otherwise f_1(x), ..., f_r(x) are called 
algebraically independent. A fundamental result from algebra is: 


Theorem 7. (Jacobson [687], vol. 3, p. 154.) Any m + 1 polynomials in m 
variables are algebraically dependent. 


The first type of basis we might look for is a set of m algebraically 
independent invariants f_1(x), ..., f_m(x). Such a set is indeed a "basis," for by 
Theorem 7 any invariant is algebraically dependent on f_1, ..., f_m and so is a 
root of a polynomial equation in f_1, ..., f_m. The following theorem guarantees 
the existence of such a basis. 


Theorem 8. (Burnside [211, p. 357].) There always exist m algebraically 
independent invariants of 𝒢. 


Proof. Consider the polynomial 

$$\prod_{\alpha=1}^{g} (t - A_\alpha \circ x_1)$$

in the variables t, x_1, ..., x_m. Since one of the A_α is the identity matrix, t = x_1 
is a zero of this polynomial. When the polynomial is expanded in powers of t, 
the coefficients are invariants by the remark immediately following the proof 
of Theorem 4. Therefore x_1 is an algebraic function of invariants. Similarly 
each of x_2, ..., x_m is an algebraic function of invariants. Now if the number of 
algebraically independent invariants were m' (< m), the m independent 
variables x_1, ..., x_m would be algebraic functions of the m' invariants, a 
contradiction. Therefore the number of algebraically independent invariants is 
at least m. By Theorem 7 this number cannot be greater than m. Q.E.D. 


Example. For the preceding group 𝒢_4 we may take f_1 = (x + y)² and f_2 = 
(x − y)² as the algebraically independent invariants. Then any invariant is a 
root of a polynomial equation in f_1 and f_2. For example, 

$$x^2 = \tfrac{1}{4}\bigl(\sqrt{f_1} + \sqrt{f_2}\bigr)^2, \qquad xy = \tfrac{1}{4}(f_1 - f_2),$$

and so on. 

However, by far the most convenient description of the invariants is a set 
f_1, ..., f_l of invariants with the property that any invariant is a polynomial in 
f_1, ..., f_l. Then f_1, ..., f_l is called a polynomial basis (or an integrity basis) for 
the invariants of 𝒢. Of course if l > m then by Theorem 7 there will be 
polynomial equations, called syzygies, relating f_1, ..., f_l. 

For example, f_1 = x², f_2 = xy, f_3 = y² form a polynomial basis for the 
invariants of 𝒢_4. The syzygy relating them is 

$$f_1 f_3 - f_2^2 = 0.$$


The existence of a polynomial basis, and a method of finding it, is given by 
the next theorem. 


Theorem 9. (Noether [997]; see also Weyl [1410, p. 275].) The ring of 
invariants of a finite group 𝒢 of complex m × m matrices has a polynomial 
basis consisting of not more than $\binom{m+g}{m}$ invariants, of degree not exceeding g, 
where g is the order of 𝒢. Furthermore this basis may be obtained by taking the 
average over 𝒢 of all monomials 

$$x_1^{b_1} x_2^{b_2} \cdots x_m^{b_m}$$

of total degree Σ b_i not exceeding g. 


Proof. Let the group consist of the transformations (29). Suppose 

$$f(x_1, \ldots, x_m) = \sum_{e} c_e\, x_1^{e_1} \cdots x_m^{e_m},$$

c_e complex, is any invariant of 𝒢. (The sum extends over all e = e_1 ⋯ e_m for 
which there is a nonzero term x_1^{e_1} ⋯ x_m^{e_m} in f(x_1, ..., x_m).) Since f(x_1, ..., x_m) 
is an invariant, it is unchanged when we average it over the group, so 

$$\begin{aligned}
f(x_1, \ldots, x_m) &= \frac{1}{g}\bigl\{ f(x_1^{(1)}, \ldots, x_m^{(1)}) + \cdots + f(x_1^{(g)}, \ldots, x_m^{(g)}) \bigr\} \\
&= \frac{1}{g} \sum_{e} c_e \bigl\{ (x_1^{(1)})^{e_1} \cdots (x_m^{(1)})^{e_m} + \cdots + (x_1^{(g)})^{e_1} \cdots (x_m^{(g)})^{e_m} \bigr\} \\
&= \frac{1}{g} \sum_{e} c_e J_e \quad\text{(say).}
\end{aligned}$$

Every invariant is therefore a linear combination of the (infinitely many) 
special invariants 

$$J_e = \sum_{\alpha=1}^{g} (x_1^{(\alpha)})^{e_1} \cdots (x_m^{(\alpha)})^{e_m}.$$

Now J_e is (apart from a constant factor) the coefficient of u_1^{e_1} ⋯ u_m^{e_m} in 

$$P_e = \sum_{\alpha=1}^{g} \bigl(u_1 x_1^{(\alpha)} + \cdots + u_m x_m^{(\alpha)}\bigr)^{e}, \tag{34}$$

where e = e_1 + ⋯ + e_m. In other words the P_e are the power sums of the g 
quantities 

$$u_1 x_1^{(1)} + \cdots + u_m x_m^{(1)},\; \ldots,\; u_1 x_1^{(g)} + \cdots + u_m x_m^{(g)}.$$

Any power sum P_e, e = 1, 2, ..., can be written as a polynomial with rational 
coefficients in the first g power sums P_1, P_2, ..., P_g (Problem 3). Therefore 
any J_e for 

$$e = \sum_{i=1}^{m} e_i > g$$

(which is a coefficient of P_e) can be written as a polynomial in the special 
invariants 

$$J_e \quad\text{with}\quad e_1 + \cdots + e_m \le g$$

(which are the coefficients of P_1, ..., P_g). Thus any invariant can be written 
as a polynomial in the J_e with Σ_{i=1}^{m} e_i ≤ g. The number of such J_e is the number 
of e_1, e_2, ..., e_m with e_i ≥ 0 and e_1 + ⋯ + e_m ≤ g, which is $\binom{m+g}{m}$. Finally 
deg J_e ≤ g, and J_e is obtained by averaging x_1^{e_1} ⋯ x_m^{e_m} over the group. 

Q.E.D. 
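
The counting step at the end of the proof can be confirmed by direct enumeration for small cases. The sketch below (function names are illustrative assumptions) counts the exponent vectors by brute force and compares the result with the binomial coefficient; it also evaluates the bound for the groups met so far.

```python
from itertools import product
from math import comb


def count_exponent_vectors(m, g):
    """Number of (e_1, ..., e_m) with e_i >= 0 and e_1 + ... + e_m <= g."""
    return sum(1 for e in product(range(g + 1), repeat=m) if sum(e) <= g)


def noether_bound(m, g):
    """Upper bound of Theorem 9 on the size of a polynomial basis."""
    return comb(m + g, m)


if __name__ == "__main__":
    print(noether_bound(2, 2))     # the group of order 2 in the example
    print(noether_bound(2, 192))   # the group of order 192 from §2
```

The bound is usually far from tight: for the group of order 192 it allows 18721 basic invariants, whereas Theorem 3a shows that two suffice.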


Molien's theorem. Since we know from Theorem 9 that a polynomial basis 
always exists, we can go ahead with confidence and try to find it, using the 
methods described in §2. To discover when a basis has been found, we use 
Molien's theorem (Theorem 2 above). This states that if a_d is the number of 
linearly independent homogeneous invariants of 𝒢 with degree d, and 

$$\Phi_{\mathscr{G}}(\lambda) = \sum_{d=0}^{\infty} a_d \lambda^d,$$

then 

$$\Phi_{\mathscr{G}}(\lambda) = \frac{1}{g} \sum_{\alpha=1}^{g} \frac{1}{\det(I - \lambda A_\alpha)}. \tag{35}$$


The proof depends on the following theorem. 


Theorem 10. (Miller et al. [955, p. 258], Serre [1185, p. 29].) The number of 
linearly independent invariants of 𝒢 of degree 1 is 

$$a_1 = \frac{1}{g} \sum_{\alpha=1}^{g} \operatorname{trace}(A_\alpha).$$


Proof. Let 

$$S = \frac{1}{g} \sum_{\alpha=1}^{g} A_\alpha.$$

Changing the variables on which 𝒢 acts from x_1, ..., x_m to y_1, ..., y_m, where 
(y_1, ..., y_m) = (x_1, ..., x_m)T^{tr}, changes S to S' = TST^{−1}. We may choose T so 
that S' is diagonal (see Burnside [211, p. 252]). Now S² = S, (S')² = S', hence 
the diagonal entries of S' are 0 or 1. So with a change of variables we may 
assume 

$$S = \operatorname{diag}(1, \ldots, 1, 0, \ldots, 0),$$

with say r 1's on the diagonal. Thus S ∘ y_i = y_i if 1 ≤ i ≤ r, S ∘ y_i = 0 if 
r + 1 ≤ i ≤ m. 

Any linear invariant of 𝒢 is certainly fixed by S, so a_1 ≤ r. On the other 
hand, by Theorem 4, 

$$S \circ y_i = \frac{1}{g} \sum_{\alpha=1}^{g} A_\alpha \circ y_i$$

is an invariant of 𝒢 for any i, and so a_1 ≥ r. Since r = trace S, the theorem 
follows. Q.E.D. 
Before proving Theorem 2 let us introduce some more notation. Equation 
(29) describes how A_α transforms the variables x_1, ..., x_m. The dth induced 
matrix, denoted by A_α^{[d]}, describes how A_α transforms the products of the x_i 
taken d at a time, namely x_1^d, x_2^d, ..., x_1^{d−1}x_2, ... (Littlewood [857, p. 122]). E.g. 

$$A_\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

transforms x_1², x_1x_2 and x_2² into 

$$\begin{gathered}
a^2x_1^2 + 2ab\,x_1x_2 + b^2x_2^2, \\
ac\,x_1^2 + (ad + bc)\,x_1x_2 + bd\,x_2^2, \\
c^2x_1^2 + 2cd\,x_1x_2 + d^2x_2^2
\end{gathered}$$

respectively. Thus the 2nd induced matrix is 

$$A_\alpha^{[2]} = \begin{pmatrix} a^2 & 2ab & b^2 \\ ac & ad + bc & bd \\ c^2 & 2cd & d^2 \end{pmatrix}.$$


Proof of Theorem 2. To prove Equation (35), note that $a_d$ is equal to the number of linearly independent invariants of degree 1 of $\mathcal{G}^{[d]} = \{A_\alpha^{[d]} : \alpha = 1, \ldots, g\}$. By Theorem 10,

$$a_d = \frac{1}{g} \sum_{\alpha=1}^{g} \operatorname{trace} A_\alpha^{[d]}.$$

Therefore to prove Theorem 2 it is enough to show that the trace of $A_\alpha^{[d]}$ is equal to the coefficient of $\lambda^d$ in

$$\frac{1}{\det(I - \lambda A_\alpha)} = \frac{1}{(1 - \lambda\omega_1) \cdots (1 - \lambda\omega_m)}, \tag{36}$$

where $\omega_1, \ldots, \omega_m$ are the eigenvalues of $A_\alpha$. By a suitable change of variables we can make

$$A_\alpha = \operatorname{diag}(\omega_1, \ldots, \omega_m), \qquad A_\alpha^{[d]} = \operatorname{diag}(\omega_1^d, \omega_2^d, \ldots, \omega_1^{d-1}\omega_2, \ldots),$$

and trace $A_\alpha^{[d]}$ = sum of the products of $\omega_1, \ldots, \omega_m$ taken $d$ at a time. But this is exactly the coefficient of $\lambda^d$ in the expansion of (36). Q.E.D.


It is worth remarking that the Molien series does not determine the group. For example there are two groups of $2 \times 2$ matrices with order 8 having

$$\Phi(\lambda) = \frac{1}{(1 - \lambda^2)(1 - \lambda^4)}$$

(namely the dihedral group $\mathcal{D}_4$ and the abelian group $\mathcal{Z}_2 \times \mathcal{Z}_4$). In fact there exist abstract groups $\mathcal{A}$ and $\mathcal{B}$ whose matrix representations can be paired in such a way that every representation of $\mathcal{A}$ has the same Molien series as the corresponding representation of $\mathcal{B}$ (Dade [324]).
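The remark above can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes one particular choice of generators for each group (a coordinate swap and a sign change for the dihedral group, and diag(i, 1), diag(1, -1) for the abelian group), and it expands each term $1/\det(I - \lambda A)$ of (35) using $\det(I - \lambda A) = 1 - \lambda \operatorname{trace} A + \lambda^2 \det A$ for $2 \times 2$ matrices. The function names are illustrative only.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def generate_group(generators):
    """Close a set of 2x2 matrices under multiplication (finite groups only)."""
    identity = ((1, 0), (0, 1))
    group = {identity}
    frontier = [identity]
    while frontier:
        A = frontier.pop()
        for G in generators:
            B = mat_mul(A, G)
            if B not in group:
                group.add(B)
                frontier.append(B)
    return group


def molien_coefficients(group, terms):
    """First `terms` coefficients a_0, a_1, ... of the Molien series (35)."""
    total = [0] * terms
    for A in group:
        t = A[0][0] + A[1][1]                       # trace of A
        d = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # determinant of A
        # 1/(1 - t*x + d*x^2) = sum c_n x^n with c_n = t*c_{n-1} - d*c_{n-2}
        c = [1, t]
        for n in range(2, terms):
            c.append(t * c[n - 1] - d * c[n - 2])
        for n in range(terms):
            total[n] += c[n]
    # The averages are real integers; round away any floating-point noise.
    return [round((s / len(group)).real) for s in total]


# Assumed generators (hypothetical choices realizing the two groups of order 8).
D4_GENERATORS = [((0, 1), (1, 0)), ((1, 0), (0, -1))]
Z2Z4_GENERATORS = [((1j, 0), (0, 1)), ((1, 0), (0, -1))]

if __name__ == "__main__":
    for gens in (D4_GENERATORS, Z2Z4_GENERATORS):
        group = generate_group(gens)
        print(len(group), molien_coefficients(group, 9))
```

Both groups give the coefficients of $1/((1-\lambda^2)(1-\lambda^4))$, although the first is nonabelian and the second abelian.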


A standard form for the basic invariants. The following notation is very useful in describing the ring $\mathcal{J}(\mathcal{G})$ of invariants of a group $\mathcal{G}$. The complex numbers are denoted by $\mathbf{C}$, and if $p(x), q(x), \ldots$ are polynomials, $\mathbf{C}[p(x), q(x), \ldots]$ denotes the set of all polynomials in $p(x), q(x), \ldots$ with complex coefficients. For example Theorem 3a just says that $\mathcal{J}(\mathcal{G}_1) = \mathbf{C}[\theta, \phi]$.

Also $\oplus$ will denote the usual direct sum operation. For example a statement like $\mathcal{J}(\mathcal{G}) = R \oplus S$ means that every invariant of $\mathcal{G}$ can be written uniquely in the form $r + s$ where $r \in R$, $s \in S$. (Theorem 12 below illustrates this.)

Using this notation we can now specify the most convenient form of polynomial basis for $\mathcal{J}(\mathcal{G})$.
Definition. A good polynomial basis for $\mathcal{J}(\mathcal{G})$ consists of homogeneous invariants $f_1, \ldots, f_l$ ($l \ge m$) where $f_1, \ldots, f_m$ are algebraically independent and

$$\mathcal{J}(\mathcal{G}) = \mathbf{C}[f_1, \ldots, f_m] \quad \text{if } l = m, \tag{37a}$$

or, if $l > m$,

$$\mathcal{J}(\mathcal{G}) = \mathbf{C}[f_1, \ldots, f_m] \oplus f_{m+1}\mathbf{C}[f_1, \ldots, f_m] \oplus \cdots \oplus f_l\,\mathbf{C}[f_1, \ldots, f_m]. \tag{37b}$$

In words, this says that any invariant of $\mathcal{G}$ can be written as a polynomial in $f_1, \ldots, f_m$ (if $l = m$), or as such a polynomial plus $f_{m+1}$ times another such polynomial plus $\cdots$ (if $l > m$). Speaking loosely, this says that, to describe an arbitrary invariant, $f_1, \ldots, f_m$ are "free" invariants and can be used as often as needed, while $f_{m+1}, \ldots, f_l$ are "transient" invariants and can each be used at most once.

For a good polynomial basis $f_1, \ldots, f_l$ we can say exactly what the syzygies are. If $l = m$ there are no syzygies. If $l > m$ there are $(l - m)^2$ syzygies expressing the products $f_if_j$ ($i > m$, $j > m$) in terms of $f_1, \ldots, f_l$.

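It is important to note that the Molien series can be written down by inspection from the degrees of a good polynomial basis. Let $d_1 = \deg f_1, \ldots, d_l = \deg f_l$. Then

$$\Phi_{\mathcal{G}}(\lambda) = \frac{1}{\prod_{i=1}^{m} (1 - \lambda^{d_i})} \quad \text{if } l = m, \tag{38a}$$

or

$$\Phi_{\mathcal{G}}(\lambda) = \frac{1 + \sum_{j=m+1}^{l} \lambda^{d_j}}{\prod_{i=1}^{m} (1 - \lambda^{d_i})} \quad \text{if } l > m. \tag{38b}$$

(This is easily verified by expanding (38a) and (38b) in powers of $\lambda$ and comparing with (37b).)
Some examples will make this clear.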


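(1) For the group $\mathcal{G}_1$ of §2, $f_1 = W_2(x, y)$ and $f_2 = W_3'(x, y)$ form a good polynomial basis, with degrees $d_1 = 8$, $d_2 = 24$. Indeed, from Theorem 3a and Equation (11),

$$\mathcal{J}(\mathcal{G}_1) = \mathbf{C}[W_2(x, y), W_3'(x, y)]$$

and

$$\Phi_{\mathcal{G}_1}(\lambda) = \frac{1}{(1 - \lambda^8)(1 - \lambda^{24})}.$$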


(2) For the group $\mathcal{G}$ defined above, $f_1 = x^2$, $f_2 = y^2$, $f_3 = xy$ is a good polynomial basis, with $d_1 = d_2 = d_3 = 2$. The invariants can be described as

$$\mathcal{J}(\mathcal{G}) = \mathbf{C}[x^2, y^2] \oplus xy\,\mathbf{C}[x^2, y^2]. \tag{39}$$

In words, any invariant can be written uniquely as a polynomial in $x^2$ and $y^2$ plus $xy$ times another such polynomial. E.g.

$$(x + y)^4 = (x^2)^2 + 6x^2y^2 + (y^2)^2 + xy(4x^2 + 4y^2).$$

The Molien series is

$$\Phi_{\mathcal{G}}(\lambda) = \frac{1}{2}\left\{\frac{1}{(1 - \lambda)^2} + \frac{1}{(1 + \lambda)^2}\right\} = \frac{1 + \lambda^2}{(1 - \lambda^2)^2},$$

in agreement with (38b) and (39). The single syzygy is $x^2 \cdot y^2 = (xy)^2$. Note that $f_1 = x^2$, $f_2 = xy$, $f_3 = y^2$ is not a good polynomial basis, for the invariant $y^4$ is not in the set $\mathbf{C}[x^2, xy] \oplus y^2\mathbf{C}[x^2, xy]$.
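As a quick consistency check (a worked expansion added here, assuming, as the two terms of the Molien series indicate, that the group consists of $I$ and $-I$), the series can be compared with a direct count of invariants. Every monomial of even degree is invariant, so there are $2k + 1$ linearly independent invariants of degree $2k$, and indeed

$$\frac{1 + \lambda^2}{(1 - \lambda^2)^2} = (1 + \lambda^2) \sum_{k \ge 0} (k + 1)\lambda^{2k} = \sum_{k \ge 0} \bigl((k + 1) + k\bigr)\lambda^{2k} = \sum_{k \ge 0} (2k + 1)\lambda^{2k}.$$

The two summands match the decomposition (39): $k + 1$ monomials of degree $2k$ lie in $\mathbf{C}[x^2, y^2]$, and the remaining $k$ lie in $xy\,\mathbf{C}[x^2, y^2]$.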

Fortunately the following result holds. 


Theorem 11. (Hochster and Eagon [657, Proposition 13]; independently proved by Dade [325].) A good polynomial basis exists for the invariants of any finite group of complex $m \times m$ matrices.


(The proof is too complicated to give here.)

So we know that for any group the Molien series can be put into the standard form of Equations (38a), (38b) (with denominator consisting of a product of $m$ factors $(1 - \lambda^{d_i})$ and numerator consisting of a sum of powers of $\lambda$ with positive coefficients); and that a good polynomial basis, as in Equations (37a), (37b), can be found whose degrees match the powers of $\lambda$ occurring in the Molien series.

On the other hand the converse is not true. It is not always true that when the Molien series has been put into the form (38a), (38b) (by cancelling common factors and multiplying top and bottom by new factors), then a good polynomial basis for $\mathcal{J}(\mathcal{G})$ can be found whose degrees match the powers of $\lambda$ in $\Phi(\lambda)$. This is shown by the following example, due to Stanley [1262].

Let $\mathcal{G}_6$ be the group of order 8 generated by the matrices

$$\begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & i \end{pmatrix}.$$

The Molien series is

$$\Phi_{\mathcal{G}_6}(\lambda) = \frac{1}{(1 - \lambda^2)^3} \tag{40}$$

$$= \frac{1 + \lambda^2}{(1 - \lambda^2)^2(1 - \lambda^4)}. \tag{41}$$

A good polynomial basis exists corresponding to Equation (41), namely

$$\mathcal{J}(\mathcal{G}_6) = \mathbf{C}[x^2, y^2, z^4] \oplus xy\,\mathbf{C}[x^2, y^2, z^4],$$

but there is no good polynomial basis corresponding to (40).


Research Problem (19.1). Which forms of $\Phi(\lambda)$ correspond to a good polynomial basis and which do not?


One important special case has been solved. Shephard and Todd [1196] 
have characterized those groups for which (37a) and (38a) hold, i.e., for which 
a good polynomial basis exists consisting only of algebraically independent 
invariants. These are the groups known as unitary groups generated by 
reflections. A complete list of the 37 irreducible groups of this type is given in 
[1196]. 


Problem. (4) Let

$$P_e = \sum_{i=1}^{g} y_i^e, \qquad e = 0, 1, 2, \ldots,$$

where $y_1, \ldots, y_g$ are indeterminates. Show that $P_e$ is a polynomial in $P_1, \ldots, P_g$ with rational coefficients. (Hint: Problem 52 of Ch. 8; see also [756].)


*§4. Generalizations of Gleason's theorem


All of the generalized weight enumerators of self-dual codes can be characterized by invariant theory. We work out one further example in detail, illustrating the general plan of attack described in §2 in a situation where it is more difficult to find a good polynomial basis. Several other generalizations are given as problems.


The complete weight enumerator of a ternary self-dual code. Let $\mathcal{C}$ be an $[n, \frac{1}{2}n, d]$ ternary self-dual code which contains some codeword of weight $n$. By suitably multiplying columns by $-1$ we can assume that $\mathcal{C}$ contains the codeword $\mathbf{1} = 111 \cdots 1$.

The goal of this section is to characterize the complete weight enumerator of $\mathcal{C}$ by proving:


Theorem 12. If $W(x, y, z)$ is the complete weight enumerator of a ternary self-dual code which contains $\mathbf{1}$, then

$$W(x, y, z) \in \mathbf{C}[\alpha_{12}, \beta_6^2, \delta_{36}] \oplus \beta_6\gamma_{18}\,\mathbf{C}[\alpha_{12}, \beta_6^2, \delta_{36}]$$

(i.e., $W(x, y, z)$ can be written uniquely as a polynomial in $\alpha_{12}$, $\beta_6^2$, $\delta_{36}$ plus $\beta_6\gamma_{18}$ times another such polynomial), where

$$\begin{aligned} \alpha_{12} &= a(a^3 + 8p^3), \\ \beta_6 &= a^2 - 12b, \\ \gamma_{18} &= a^6 - 20a^3p^3 - 8p^6, \\ \delta_{36} &= p^3(a^3 - p^3)^3, \end{aligned}$$

and

$$\begin{aligned} a &= x^3 + y^3 + z^3, \\ p &= 3xyz, \\ b &= x^3y^3 + x^3z^3 + y^3z^3. \end{aligned}$$

Note that

$$\gamma_{18}^2 = \alpha_{12}^3 - 64\delta_{36}.$$

(The subscript of a polynomial gives its degree.)
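Proof. The proof follows the two stages described in §2.


Stage I. Let a typical codeword $u \in \mathcal{C}$ contain $a$ 0's, $b$ 1's, and $c$ 2's. Then since $\mathcal{C}$ is self-dual and contains $\mathbf{1}$,

$$\begin{aligned} u \cdot u \equiv 0 \pmod 3 &\Rightarrow 3 \mid (b + c) \quad \text{(the Hamming weight is divisible by 3)}, \\ u \cdot \mathbf{1} \equiv 0 \pmod 3 &\Rightarrow 3 \mid (b - c) \Rightarrow 3 \mid b \text{ and } 3 \mid c, \\ \mathbf{1} \cdot \mathbf{1} \equiv 0 \pmod 3 &\Rightarrow 3 \mid (a + b + c) \Rightarrow 3 \mid a. \end{aligned}$$

Therefore $W(x, y, z)$ is invariant under the transformations

$$\begin{pmatrix} \omega & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad J_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \omega & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \omega \end{pmatrix}, \qquad \omega = e^{2\pi i/3}.$$

Also $-u$ contains $a$ 0's, $c$ 1's, $b$ 2's, and $\mathbf{1} + u$ contains $c$ 0's, $a$ 1's, $b$ 2's. Therefore $W(x, y, z)$ is invariant under

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix},$$

i.e., under any permutation of its arguments.
Finally, from Equation (43) of Ch. 5, $W(x, y, z)$ is invariant under

$$M_3 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ 1 & \omega^2 & \omega \end{pmatrix}.$$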
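These 6 matrices generate a group $\mathcal{G}$ of order 2592, consisting of 1944 matrices of the type

$$\xi^{\nu} \begin{pmatrix} 1 & & \\ & \omega^a & \\ & & \omega^b \end{pmatrix} M_3^{e} \begin{pmatrix} 1 & & \\ & \omega^c & \\ & & \omega^d \end{pmatrix}, \qquad \xi = e^{2\pi i/12},$$

and 648 matrices of the type

$$\xi^{\nu} \begin{pmatrix} 1 & & \\ & \omega^a & \\ & & \omega^b \end{pmatrix} P,$$

where $0 \le \nu \le 11$, $0 \le a, b, c, d \le 2$, $e = 1$ or 3, and $P$ is any $3 \times 3$ permutation matrix.

Thus Stage I is completed: the assumptions about the code imply that $W(x, y, z)$ is invariant under the group $\mathcal{G}$.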


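Stage II consists of showing that the ring of invariants of $\mathcal{G}$ is equal to $\mathbf{C}[\alpha_{12}, \beta_6^2, \delta_{36}] \oplus \beta_6\gamma_{18}\,\mathbf{C}[\alpha_{12}, \beta_6^2, \delta_{36}]$. First, since we have a list of the matrices in $\mathcal{G}$, it is a straightforward hand calculation to obtain the Molien series, Eq. (9). As usual everything collapses and the final expression is

$$\Phi_{\mathcal{G}}(\lambda) = \frac{1 + \lambda^{24}}{(1 - \lambda^{12})^2(1 - \lambda^{36})}.$$

This suggests the degrees of a good polynomial basis that we should look for. Next, $\mathcal{G}$ is generated by $J_3$, $M_3$, and all permutation matrices $P$. Obviously the invariants must be symmetric functions of $x$, $y$, $z$ having degree a multiple of 3. So we take the algebraically independent symmetric functions $a$, $p$, $b$, and find functions of them which are invariant under $J_3$ and $M_3$. For example, $\beta_6$ is invariant under $J_3$, but is sent into $-\beta_6$ by $M_3$. We denote this by writing

$$\beta_6 \overset{J_3}{\longleftrightarrow} \beta_6, \qquad \beta_6 \overset{M_3}{\longleftrightarrow} -\beta_6.$$

Therefore $\beta_6^2$ is an invariant. Again

$$a \overset{M_3}{\longleftrightarrow} \frac{1}{\sqrt{3}}(a + 2p) \overset{J_3}{\longleftrightarrow} \frac{1}{\sqrt{3}}(a + 2\omega p) \overset{M_3}{\longleftrightarrow} \frac{i}{\sqrt{3}}(a + 2\omega^2 p),$$

so another invariant is

$$\alpha_{12} = a(a + 2p)(a + 2\omega p)(a + 2\omega^2 p) = a(a^3 + 8p^3).$$

Again

$$\gamma_{18} \overset{J_3}{\longleftrightarrow} \gamma_{18}, \qquad \gamma_{18} \overset{M_3}{\longleftrightarrow} -\gamma_{18},$$

so $\beta_6\gamma_{18}$ is an invariant. Finally

$$p \overset{M_3}{\longleftrightarrow} \frac{1}{\sqrt{3}}(a - p) \overset{J_3}{\longleftrightarrow} \frac{1}{\sqrt{3}}(a - \omega p) \overset{J_3}{\longleftrightarrow} \frac{1}{\sqrt{3}}(a - \omega^2 p)$$

gives the invariant

$$\delta_{36} = p^3(a - p)^3(a - \omega p)^3(a - \omega^2 p)^3 = p^3(a^3 - p^3)^3.$$

The syzygy $\gamma_{18}^2 = \alpha_{12}^3 - 64\delta_{36}$ is easily verified, and one can show that $\alpha_{12}$, $\beta_6^2$, $\delta_{36}$ are algebraically independent. Thus $f_1 = \alpha_{12}$, $f_2 = \beta_6^2$, $f_3 = \delta_{36}$, $f_4 = \beta_6\gamma_{18}$ is a good polynomial basis for $\mathcal{J}(\mathcal{G})$, and the theorem is proved. Q.E.D.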
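The syzygy quoted in the proof can be verified in a few lines. The following worked check is added here for convenience; the abbreviations $A = a^3$ and $P = p^3$ are introduced only for this computation, and the four invariants are taken as $\alpha_{12} = a(a^3 + 8p^3)$, $\gamma_{18} = a^6 - 20a^3p^3 - 8p^6$, $\delta_{36} = p^3(a^3 - p^3)^3$.

$$\begin{aligned} \alpha_{12}^3 &= A(A + 8P)^3 = A^4 + 24A^3P + 192A^2P^2 + 512AP^3, \\ 64\delta_{36} &= 64P(A - P)^3 = 64A^3P - 192A^2P^2 + 192AP^3 - 64P^4, \\ \alpha_{12}^3 - 64\delta_{36} &= A^4 - 40A^3P + 384A^2P^2 + 320AP^3 + 64P^4, \\ \gamma_{18}^2 &= (A^2 - 20AP - 8P^2)^2 = A^4 - 40A^3P + 384A^2P^2 + 320AP^3 + 64P^4. \end{aligned}$$

The last two lines agree, which is the relation $\gamma_{18}^2 = \alpha_{12}^3 - 64\delta_{36}$.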


Remark. Without the assumption that the code contains the all-ones vector the theorem (due to R. J. McEliece) becomes much more complicated ([883, §4.7], Mallows et al. [892]). See also Problem 5.


Applications of Theorem 12. For the ternary Golay code (Equation (6)) $W = \frac{1}{6}(5\alpha_{12} + \beta_6^2)$. For the [24, 12, 9] symmetry code of Ch. 16,


We mai + $a ia Bà + mpe + HBeYis. 
The complete weight enumerators of the symmetry codes of lengths 36, 48 and 60 have also been obtained with the help of Theorem 12 (see Mallows et al. [892]).


Other generalizations. The following problems contain further theorems of the same type. Some results not given here are the Lee enumerator of a self-dual code over GF(7) (MacWilliams et al. [883, §5.3.2], Mallows and Sloane [894]); other split w.e.'s (see Mallows and Sloane [895]); and the biweight enumerator and joint w.e. (§6 of Ch. 5) of binary self-dual codes [883, §4.9, §5.4.1]. Rather less is known about the complete and Lee w.e.'s of a self-dual code over GF(q); see [883, §§5.3.1, 5.3.2] and Theorem 6 above.

The Hamming w.e.'s of the codes of types (i) and (ii) of Theorem 1 are described in Theorem 3c and Problem 3. The other two types are as follows.


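Problems. (5) (Gleason [486], Berlekamp et al. [129], Feit [425, 883].) The Hamming w.e. of a self-dual code over GF(3) is a polynomial in $W_4(x, y)$ (Equation (5)) and

$$W_6'(x, y) = \tfrac{1}{24}\{W_4(x, y)^3 - W_6(x, y)\} = y^3(x^3 - y^3)^3. \tag{42}$$

[Hint: The group is generated by

$$\frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 2 \\ 1 & -1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 0 \\ 0 & \omega \end{pmatrix}, \qquad \omega = e^{2\pi i/3},$$

has order 48 and Molien series $1/(1 - \lambda^4)(1 - \lambda^{12})$.]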
(6) ([883].) (a) The Hamming w.e. of a self-dual code over GF(4) with all weights divisible by 2 is a polynomial in $f = x^2 + 3y^2$ and $g = y^2(x^2 - y^2)^2$. [Hint: The group has order 12.] The Hamming w.e. of the [6, 3, 4] code of example (E9) is $f^3 - 9g$. The [30, 15, 12] code mentioned on p. 76 of Ch. 16 has Hamming w.e. $f^{15} - 45f^{12}g + 585f^9g^2 - 2205f^6g^3 + 1485f^3g^4 - 3249g^5$.

(b) A [2m, m] code over GF(4) with even weights is formally self-dual. If it is self-dual, then it has a binary basis.

(7) Lee enumerator for GF(5). Let $\mathcal{C}$ be a self-dual code over GF(5), with Lee enumerator

$$\mathcal{L}(x, y, z) = \sum_{u \in \mathcal{C}} x^{l_0(u)} y^{l_1(u)} z^{l_2(u)},$$

where $l_0(u)$ is the number of 0's in $u$, $l_1(u)$ is the number of $\pm 1$'s, and $l_2(u)$ is the number of $\pm 2$'s (see p. 145 of Ch. 5). Then $\mathcal{L}(x, y, z)$ is a polynomial in A,
B and C, where
A 7 x! * YZ, 
B-28x'YZ-2x'Y'Z + Y!'Z - x Y? - Z5, 
C = 320x5Y?Z? — 160x* Y? Z? + 20x Y'Z* -6Y? Z^ 
—4x(Y? + Z)2x' - 207 YZ + 5Y?Z?) 
+ Y¥°+Z", (43) 


and Y = 2y, Z = 2z. [Hint: the group has order 120. See [883, §5.3.2] and Klein 
(768, pp. 236-243].] The following examples illustrate this result. 


Generators for code Lee weight enumerator 
(12) A 
(100133, 010313, 001331) A'- iB 
(1122000000, 0000100122, 0000010213, 
1414141414, 2420430100) A L-iABG GC 


(8) (Feit [424], Mallows and Sloane [895].) Suppose $n$ is odd and let $\mathcal{C}$ be an $[n, \frac{1}{2}(n - 1)]$ binary code which is contained in its dual and has all weights divisible by 4. Examples are the little [7, 3, 4] Hamming and [23, 11, 8] Golay codes, with w.e.'s

$$\phi_7 = x^7 + 7x^3y^4, \tag{44}$$
$$\gamma_{23} = x^{23} + 506x^{15}y^8 + 1288x^{11}y^{12} + 253x^7y^{16}. \tag{45}$$

Then $n$ must be of the form $8m \pm 1$. If $n = 8m - 1$, the w.e. of $\mathcal{C}$ is an element of

$$\phi_7\,\mathbf{C}[W_2(x, y), W_3'(x, y)] \oplus \gamma_{23}\,\mathbf{C}[W_2(x, y), W_3'(x, y)], \tag{46}$$

while if $n = 8m + 1$, the w.e. is an element of

$$x\,\mathbf{C}[W_2(x, y), W_3'(x, y)] \oplus \psi_{17}\,\mathbf{C}[W_2(x, y), W_3'(x, y)], \tag{47}$$

where

$$\psi_{17} = x^{17} + 17x^{13}y^4 + 187x^9y^8 + 51x^5y^{12} \tag{48}$$

is the w.e. of the [17, 8, 4] code $I_{17}$ found by Pless [1058]. Hence show that the w.e.'s of the [31, 15, 8] and [47, 23, 12] QR codes are respectively

$$-14\phi_7 W_3'(x, y) + \gamma_{23} W_2(x, y),$$
$$\tfrac{1}{7}\{-253\phi_7 W_2(x, y)^2 W_3'(x, y) + \gamma_{23}(7W_2(x, y)^3 - 41W_3'(x, y))\}.$$

For the corresponding theorems for binary self-dual codes with weights divisible by 2, and for ternary self-dual codes, see Mallows et al. [892], [895].

(9) Split weight enumerators. Suppose $\mathcal{C}$ is a $[2m, m]$ even self-dual code which contains the codewords $0^m1^m$ and $1^m0^m$, and has the property that the number of codewords with $(w_L, w_R) = (j, k)$ is equal to the number with $(w_L, w_R) = (k, j)$. (Here $w_L$ and $w_R$ denote the left and right weights; see p. 149 of Ch. 5.) Such a code is "balanced" about its midpoint, and the division into two halves is a natural one. Then the split w.e. $\mathcal{S}_{\mathcal{C}}(x, y, X, Y)$ of $\mathcal{C}$ (p. 149 of Ch. 5) is a polynomial in $\eta_8$, $\theta_{16}$ and $\gamma_{24}$, where

$$\eta_8 = x^4X^4 + x^4Y^4 + y^4X^4 + y^4Y^4 + 12x^2y^2X^2Y^2, \tag{49}$$
$$\theta_{16} = (x^2X^2 - y^2Y^2)^2(x^2Y^2 - y^2X^2)^2, \tag{50}$$
$$\gamma_{24} = x^2y^2X^2Y^2(x^4 - y^4)^2(X^4 - Y^4)^2. \tag{51}$$


(10) Examples of split w.e.'s. Use a detached-coefficient notation for F, and 
instead of the terms 


a(x^y^X*Y* Tx'y'"X'Y: +x’y XY’ Tx*y*X"Y*) 
write a row of a table: 


clo xy X Y # 
a abcd 4 


giving respectively the coefficient, the exponents, and the number of terms of 
this type. The sum of the products of the first and last columns is the total 
number of codewords. Use the preceding problem to obtain the split w.e.’s 
shown in Fig. 19.1 (taking the generator matrices for these codes in the 
canonical form of Equation (57) of Ch. 16). 

(11) Suppose $\Pi$ is a projective plane of order $n$, where $n \equiv 2 \pmod 4$ (see p. 59 of Ch. 2). Let $A = (a_{ij})$ be the $(n^2 + n + 1) \times (n^2 + n + 1)$ adjacency matrix of $\Pi$, where $a_{ij} = 1$ if the $i$th line passes through the $j$th point, and $a_{ij} = 0$ otherwise. Let $\mathcal{C}$ be the binary code generated by the rows of $A$, and let $\mathcal{C}^*$ be obtained by adding an overall parity check to $\mathcal{C}$. (i) Show that if $n = 2$, $\mathcal{C}$ is the [7, 4, 3] Hamming code. (ii) Show in general that

$$\mathcal{C}^* \text{ is an } [n^2 + n + 2, \tfrac{1}{2}(n^2 + n + 2), n + 2]$$

even self-dual code. (iii) Hence show that there is no projective plane of order 6 [Hint: from Theorem 3c, since $8 \nmid 44$]. Note that the results of problem 6 apply to the w.e. of $\mathcal{C}$. The w.e. of the hypothetical plane of order 10 is discussed by Assmus et al. [43, 48], MacWilliams et al. [888] and Mallows and Sloane [895].

[Fig. 19.1. Split w.e.'s of even self-dual codes (tables for the [8, 4, 4] Hamming code and the [24, 12, 8] code; the entries are not recoverable from this copy).]


Research Problem (19.2). Let $\mathcal{C}$ be a binary code with weights divisible by 4, having parameters $[n, \frac{1}{2}n]$ if $n$ is even or $[n, \frac{1}{2}(n - 1)]$ if $n$ is odd, with $\mathcal{C} \subset \mathcal{C}^{\perp}$. Characterize the biweight enumerator of $\mathcal{C}$.
§5. The nonexistence of certain very good codes


In this section Theorem 3c is used to obtain an upper bound (Theorem 13) on the minimum distance of an even self-dual code. It is then shown (Corollary 16) that this upper bound can only be attained for small values of $n$. The method generalizes the calculation of the w.e. of the [48, 24, 12] QR code given in §2. Similar results hold for the other types of self-dual codes given in Theorem 1.

Let $\mathcal{C}$ be an $[n, \frac{1}{2}n, d]$ even self-dual code, having w.e. $W(x, y) = x^n + A_d x^{n-d}y^d + \cdots$. From Theorem 3c, $W(x, y)$ is a polynomial in $W_2(x, y)$ and $W_3'(x, y)$, and can be written as

$$W(x, y) = \sum_{r=0}^{\mu} a_r W_2(x, y)^{j-3r} W_3'(x, y)^r, \tag{52}$$

where $n = 8j = 24\mu + 8\nu$, $\nu = 0$, 1 or 2.
Suppose the $\mu + 1 = [n/24] + 1$ coefficients $a_r$ in (52) are chosen so that

$$W(x, y) = x^n + A^*_{4\mu+4} x^{n-4\mu-4} y^{4\mu+4} + \cdots = W(x, y)^* \text{ (say)}. \tag{53}$$

I.e., the $a_r$ are chosen so that $W(x, y)$ has as many leading coefficients as possible equal to zero. It will be shown below that this determines the $a_r$ uniquely. The resulting $W(x, y)^*$ given by (53) is the weight enumerator of that even self-dual code with the greatest minimum weight we could hope to attain, and is called an extremal weight enumerator.

If a code exists with weight enumerator $W(x, y)^*$, it has minimum distance $d^* = 4\mu + 4$, unless it should happen that $A^*_{4\mu+4}$ in (53) is accidentally zero, in which case $d^* \ge 4\mu + 8$. But this doesn't happen.


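Theorem 13. (Mallows and Sloane [893].) $A^*_{4\mu+4}$, the number of codewords of minimum nonzero weight in the extremal weight enumerator, is given by:

$$\binom{n}{5}\binom{5\mu - 2}{\mu - 1} \Big/ \binom{4\mu + 4}{5} \quad \text{if } n = 24\mu, \tag{54}$$

$$\frac{1}{4}\,n(n - 1)(n - 2)(n - 4)\,\frac{(5\mu)!}{\mu!\,(4\mu + 4)!} \quad \text{if } n = 24\mu + 8, \tag{55}$$

$$\frac{3}{2}\,n(n - 2)\,\frac{(5\mu + 2)!}{\mu!\,(4\mu + 4)!} \quad \text{if } n = 24\mu + 16, \tag{56}$$

and is never zero. Therefore the minimum distance of an even self-dual code of length $n$ is at most $4[n/24] + 4$.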
Proof. Here we give an elementary proof for the case $n = 24\mu$; the other cases are similar. A second proof which applies to all cases simultaneously will be given below.

By Theorem 9 of Ch. 6 the codewords of weight $4\mu + 4$ form a 5-design. It is easiest to calculate the parameter $\lambda_4^{(4\mu+4)}$ of the consequent 4-design. From Equation (9) of Ch. 6 this is given by
445 — 2:, 1$ (os = ‘) S(r) 


Ae 45-2 t= 
Apt no n p! TEAN r-4 jdn t4-r 


where 


4-1 
S(x)= [| Gu * 4i - »). 
jzt 
The following identities are easily verified. 


$(r-4)- S(r) | S(r) 
16 —4 "4p t4-r 


(465) 0-9-8 74) 


r=4 





24u 
= 5 aF 2) S(r) = S(24u), (Equation (14) of Ch. 6). 





Hence 
AM a -21- - su S212 OR D! 
Ai®,, = CHD- 2)! 
” p (4un-n " 
and so 
AS. - 4 yw 


Thus the number of codewords of weight 4u +4 is given by (54). 
Q.E.D. 
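The construction of the extremal weight enumerator can be carried out mechanically. The sketch below is an illustration added here, not the authors' program: it assumes the substitution $x \to 1$, $y^4 \to x$, under which the two basic invariants become $f(x) = 1 + 14x + x^2$ and $g(x) = x(1 - x)^4$, and it chooses the coefficients of the expansion in powers of $f$ and $g$ one at a time so that the coefficients of $x, x^2, \ldots, x^{\mu}$ vanish. The function names are hypothetical.

```python
def poly_mul(p, q, terms):
    """Product of two coefficient lists, truncated to `terms` coefficients."""
    out = [0] * terms
    for i, pi in enumerate(p):
        if pi == 0:
            continue
        for j, qj in enumerate(q):
            if i + j >= terms:
                break
            out[i + j] += pi * qj
    return out


def poly_pow(p, e, terms):
    """p**e truncated to `terms` coefficients (repeated multiplication)."""
    result = [1] + [0] * (terms - 1)
    for _ in range(e):
        result = poly_mul(result, p, terms)
    return result


def extremal_weight_enumerator(n, extra_terms=1):
    """Return (a, W): the coefficients a_0..a_mu and the leading coefficients
    of the extremal enumerator, where W[i] is the number of words of weight 4i.
    """
    if n % 8:
        raise ValueError("n must be a multiple of 8")
    j, mu = n // 8, n // 24
    terms = mu + 1 + extra_terms
    f = [1, 14, 1]            # image of x^8 + 14 x^4 y^4 + y^8
    g = [0, 1, -4, 6, -4, 1]  # image of x^4 y^4 (x^4 - y^4)^4
    W = [0] * terms
    a = []
    for r in range(mu + 1):
        term = poly_mul(poly_pow(f, j - 3 * r, terms),
                        poly_pow(g, r, terms), terms)
        # term starts with x^r (coefficient 1), so a_r fixes the x^r coefficient:
        # it must be 1 for r = 0 and 0 for 1 <= r <= mu.
        a_r = (1 if r == 0 else 0) - W[r]
        a.append(a_r)
        W = [w + a_r * t for w, t in zip(W, term)]
    return a, W


if __name__ == "__main__":
    for n in (8, 16, 24, 32, 40, 48):
        a, W = extremal_weight_enumerator(n)
        print(n, a, W[n // 24 + 1])
```

For $n = 24$ the sketch returns the coefficient list [1, -42] and 759 words of weight 8, the familiar count for the Golay code; for $n = 48$ it gives 17296 words of weight 12.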


The known codes which attain the bound in Theorem 13 are shown in Fig. 19.2. For the double circulant codes on this list see Fig. 16.7.

The extremal weight enumerators for $n \le 200$ and $n = 256$ (and in particular the w.e.'s of all codes in Fig. 19.2) are given in [893]. (These were obtained from Theorem 15.)
[n, k, d]        Code

[8, 4, 4]        Hamming code.
[16, 8, 4]       Direct sum of two Hamming codes.
[24, 12, 8]      Golay code.
[32, 16, 8]      2nd order RM or QR code.
[40, 20, 8]      Double circulant code (Fig. 16.7).
[48, 24, 12]     QR code.
[56, 28, 12]     Double circulant code.
[64, 32, 12]     Double circulant code.
[80, 40, 16]     QR code.
[88, 44, 16]     Double circulant code.
[104, 52, 20]    QR code.

Fig. 19.2. Even self-dual codes with d = 4[n/24] + 4.


Feit [426] has found a [96, 48, 16] even self-dual code using the $|a + x\,|\,b + x\,|\,a + b + x|$ construction of Ch. 18. All even self-dual codes of length $\le 24$ are given by Pless [1058], Pless and Sloane [1062, 1063]. The first gap in Fig. 19.2 is at $n = 72$.


Research Problem (19.3). ([1227].) Is there a [72, 36, 16] even self-dual code?


Problem. (12) Suppose $\mathcal{C}$ is a binary self-dual code (of type (i)), with length $n = 8m$ and minimum distance $2m + 2$. Show that the number of codewords of weight $2m + 2$ is
8m(8m — 1)(8m —2) C — d 
(2m *- 2)2m +1)⁄2m VJ /m—- 1 


But if $n$ is very large then the extremal w.e. contains a negative coefficient, and the bound of Theorem 13 cannot be attained. This can be proved by the method used to prove Theorem 13, as shown in the following problem.


Problem. (13) Suppose $\mathcal{C}$ is an even self-dual code of length $n = 24\mu$ whose w.e. is the extremal w.e. $W^*(x, y)$. The codewords of each weight form a 5-design. (a) Show that the parameter $\lambda_4^{(4\mu+8)}$ of the 4-design formed by the codewords of weight $4\mu + 8$ is


Mezco (o7. (27))- (rz) 


a eno 
(b) Hence show that $\lambda_4^{(4\mu+8)} < 0$ for sufficiently large $\mu$.
An alternative, more powerful proof makes use of:


Theorem 14. Bürmann-Lagrange (Whittaker and Watson [1414, p. 133], Goldstein [520], Good [533], Sack [1137-1139]). Let $f(x)$ and $\Phi(x)$ be analytic functions near $x = 0$, with $\Phi(0) \ne 0$. Provided that the equation

$$x = \varepsilon\,\Phi(x)$$

defines $x$ uniquely in some neighborhood of the origin, then $f(x)$ can be expanded in powers of $\varepsilon$ as follows:

$$f(x) = f(0) + \sum_{r=1}^{\infty} \frac{\varepsilon^r}{r!}\,\frac{d^{r-1}}{dx^{r-1}}\bigl[f'(x)\Phi(x)^r\bigr]_{x=0}, \tag{57}$$

valid for sufficiently small $\varepsilon$. (The prime denotes differentiation.)


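To apply this theorem to the extremal weight enumerator, we first replace $x$ by 1 and $y^4$ by $x$ in (52). Then $W_2(x, y)$ becomes $f(x) = 1 + 14x + x^2$ and $W_3'(x, y)$ becomes $g(x) = x(1 - x)^4$. For simplicity suppose that $n$ is a multiple of 24, so $\nu = 0$ and $j = 3\mu$; the other two cases are handled similarly. Equating (52) and (53) gives

$$\sum_{r=0}^{\mu} a_r f(x)^{3\mu-3r} g(x)^r = 1 + \sum_{r=\mu+1}^{\infty} A^*_{4r} x^r. \tag{58}$$

Divide by $f(x)^{3\mu}$:

$$f(x)^{-3\mu} = \sum_{r=0}^{\mu} a_r \phi(x)^r - f(x)^{-3\mu} \sum_{r=\mu+1}^{\infty} A^*_{4r} x^r, \tag{59}$$

where

$$\phi(x) = \frac{g(x)}{f(x)^3} = \frac{x(1 - x)^4}{(1 + 14x + x^2)^3} \sim x \text{ for } x \text{ small}.$$

Let us expand $f(x)^{-3\mu}$ by Theorem 14 in powers of $\varepsilon = g(x)/f(x)^3$, with

$$\Phi(x) = \frac{x}{\phi(x)} = \frac{(1 + 14x + x^2)^3}{(1 - x)^4}:$$

$$f(x)^{-3\mu} = \sum_{r=0}^{\infty} \alpha_r \phi(x)^r, \tag{60}$$

where $\alpha_0 = 1$ and

$$\begin{aligned} \alpha_r &= \frac{1}{r!}\,\frac{d^{r-1}}{dx^{r-1}}\left[\frac{d}{dx}\bigl(1 + 14x + x^2\bigr)^{-3\mu} \cdot \frac{(1 + 14x + x^2)^{3r}}{(1 - x)^{4r}}\right]_{x=0} \\ &= \frac{-3\mu}{r!}\,\frac{d^{r-1}}{dx^{r-1}}\left[\frac{(14 + 2x)(1 + 14x + x^2)^{3r-3\mu-1}}{(1 - x)^{4r}}\right]_{x=0} \end{aligned} \tag{61}$$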
for $r = 1, 2, \ldots$. Comparing (59) and (60) we see that

$$a_r = \alpha_r \quad \text{for } r = 0, 1, \ldots, \mu, \tag{62}$$

and

$$\sum_{r=\mu+1}^{\infty} \alpha_r \phi(x)^r f(x)^{3\mu} = -\sum_{r=\mu+1}^{\infty} A^*_{4r} x^r. \tag{63}$$
So far we have proved: 


Theorem 15. ([893].) The extremal weight enumerator of an even self-dual code of length $n = 24\mu$ is given by (58), where the coefficients $a_r$ are given explicitly by Equations (61), (62). Also, equating coefficients of $x^{\mu+1}$ and $x^{\mu+2}$ in (63) gives

$$A^*_{4\mu+4} = -\alpha_{\mu+1} \tag{64}$$

and

$$A^*_{4\mu+8} = -\alpha_{\mu+2} + (4\mu + 46)\alpha_{\mu+1}. \tag{65}$$

Now $\alpha_{\mu+1} < 0$ and $A^*_{4\mu+4} > 0$ follow immediately from (61), giving another proof of the upper bound in Theorem 13. But the next coefficient becomes negative:


Corollary 16. (Mallows et al. [891].) $A^*_{4\mu+8} < 0$ for all $n = 24\mu$ sufficiently large.


Proof. After some messy algebra, (61) implies that $|\alpha_{\mu+2}/\alpha_{\mu+1}|$ is bounded for large $\mu$, and so $A^*_{4\mu+8} < 0$ follows from (65). The details are omitted.
Q.E.D.


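The proof shows that $A^*_{4\mu+8}$ first goes negative when $n$ is about 3720. Indeed, when $n = 3720$ a computer was used to show that

$$W^*(x, y) = x^{3720} + A^*_{624} x^{3096} y^{624} + A^*_{628} x^{3092} y^{628} + \cdots,$$

where $A^*_{624} = 1.163\ldots \cdot 10^{170}$, $A^*_{628} = -5.848\ldots \cdot 10^{170}$, and $A^*_i > 0$ for $632 \le i \le 3088$.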
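The size of the first nonzero coefficient at $n = 3720$ can be reproduced from the closed form for $n = 24\mu$. The sketch below is an added illustration; it assumes that formula (54) reads $\binom{n}{5}\binom{5\mu-2}{\mu-1}/\binom{4\mu+4}{5}$ (a reading which reproduces the known counts 759 and 17296 for $n = 24$ and $n = 48$), and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def extremal_min_weight_count(mu):
    """Assumed closed form (54) for A*_{4mu+4} when n = 24*mu, mu >= 1."""
    n = 24 * mu
    return Fraction(comb(n, 5) * comb(5 * mu - 2, mu - 1), comb(4 * mu + 4, 5))


if __name__ == "__main__":
    print(extremal_min_weight_count(1), extremal_min_weight_count(2))
    value = extremal_min_weight_count(155)          # n = 3720, weight 624
    digits = str(value.numerator // value.denominator)
    print(f"{digits[0]}.{digits[1:4]}... x 10^{len(digits) - 1}")
```

Exact rational arithmetic is used so that no precision is lost in a number of about 170 digits; only the leading digits are printed.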

The same argument can be used to show that Corollary 16 holds also in the cases $n = 24\mu + 8$ and $n = 24\mu + 16$. An alternative method of showing that $|\alpha_{\mu+2}/\alpha_{\mu+1}|$ is bounded is given in [891]. This uses the saddle-point method to expand the expression inside the square brackets in (61).

So far we have just considered codes of type (ii) of Theorem 1. Corresponding results hold for the other types, with a similar proof.

Theorem 17. The minimum distance of a self-dual code over GF(q) with all weights divisible by t is at most d*, where:

q    t    d*
2    2    2[n/8] + 2      (66)
2    4    4[n/24] + 4     (67)
3    3    3[n/12] + 3     (68)
4    2    2[n/6] + 2      (69)

But for all sufficiently large n there is no code meeting these bounds.

Note that the McEliece et al. bound (Theorem 37 of Ch. 17) for binary codes of rate $\frac{1}{2}$ is $d/n \le 0.178$, and the Elias bound (Theorem 34 of Ch. 17) for codes of rate $\frac{1}{2}$ is $d/n \le 0.281$ (over GF(3)) and $d/n \le 0.331$ (over GF(4)), as $n \to \infty$. Thus (67), (68) are stronger bounds, while (66), (69) violate the latter bounds.

The bound (66) is met by self-dual codes of lengths 2, 4, 6, 8, 12, 14, 22, 24 but no other values of n (see [893, 1058, 1062, 1063], and Ward [1389] for the proof). Formally self-dual codes meeting (66) exist in addition for n = 10, 16, 18, 28 and possibly other values ([893, 1389]).

The bound (68) is met for n = 4, 8, 12 (the Golay code), 24, 36, 48, and 60 (QR and symmetry codes), 16 and 40 (using a code with generator matrix [I | H] where H is a Hadamard matrix; H. N. Ward, private communication), and possibly other values (see [892]). Much less is known about codes meeting (69), but see [1478].


Research Problem (19.4). What is the largest n for which codes exist meeting the bounds of Theorem 17?


The following stronger result is also proved in [891]. For any constant $b$ there is an $n_0(b)$ such that all self-dual codes of length greater than $n_0(b)$ have minimum distance less than $d^* - b$, where $d^*$ is given by Theorem 17.


§6. Good self-dual codes exist


This section counts self-dual codes in various ways and shows that some of them meet the Gilbert-Varshamov bound. For simplicity only the binary case is considered; for the general case see Mallows et al. [892], Pless [1053], [1054], Pless and Pierce [1061], and also Zhe-Xian [1457].
If $\mathcal{C} \subset \mathcal{C}^{\perp}$ then $\mathcal{C}$ is called weakly self-dual (w.s.d.).


Theorem 18. (MacWilliams et al. [887].) Let $n = 2t$ and suppose $\mathcal{C}$ is an $[n, k]$ w.s.d. (binary) code, with $k \ge 1$. Then the number of $[n, t]$ self-dual codes containing $\mathcal{C}$ is

$$\prod_{i=1}^{t-k} (2^i + 1). \tag{70}$$


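Proof. Let $\sigma_{n,m}$, $k \le m \le t$, be the number of $[n, m]$ w.s.d. codes which contain $\mathcal{C}$. We establish a recursion formula for $\sigma_{n,m}$. An $[n, m]$ w.s.d. code $\mathcal{D}$ containing $\mathcal{C}$ can be extended to an $[n, m + 1]$ w.s.d. code containing $\mathcal{C}$ by adjoining any vector of $\mathcal{D}^{\perp}$ not already in $\mathcal{D}$. Write $\mathcal{D}^{\perp}$ as the union of $2^{n-m}/2^m$ cosets of $\mathcal{D}$,

$$\mathcal{D}^{\perp} = \mathcal{D} \cup (h_1 + \mathcal{D}) \cup \cdots \cup (h_l + \mathcal{D}),$$

where $l = 2^{n-2m} - 1$. There are $l$ different extensions of $\mathcal{D}$, namely $\mathcal{D} \cup (h_j + \mathcal{D})$ for $j = 1, 2, \ldots, l$. In each of these extensions there are $2^{m+1-k} - 1$ $[n, m]$ subcodes $\mathcal{D}'$ which contain $\mathcal{C}$, since that is the number of nonzero vectors in $\mathcal{D} \cup (h_j + \mathcal{D})$ which are orthogonal to $\mathcal{C}$. Thus for $k \le m < t$,

$$\sigma_{n,m+1} = \sigma_{n,m} \cdot \frac{2^{n-2m} - 1}{2^{m+1-k} - 1}. \tag{71}$$

Starting from $\sigma_{n,k} = 1$ gives (70). Q.E.D.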
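The last step of the proof can be written out as follows (a worked step added here, writing $\sigma_{n,m}$ for the number of $[n, m]$ w.s.d. codes containing the given code and using $n = 2t$). Iterating the recursion from $m = k$ to $m = t - 1$ gives

$$\sigma_{n,t} = \prod_{m=k}^{t-1} \frac{2^{2(t-m)} - 1}{2^{m+1-k} - 1}.$$

As $m$ runs from $k$ to $t - 1$, the exponents $2(t - m)$ in the numerators run over $2, 4, \ldots, 2(t - k)$, and the exponents $m + 1 - k$ in the denominators run over $1, 2, \ldots, t - k$. Hence

$$\sigma_{n,t} = \prod_{i=1}^{t-k} \frac{2^{2i} - 1}{2^i - 1} = \prod_{i=1}^{t-k} (2^i + 1).$$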


Corollary 19. The total number of binary self-dual codes of length $n$ is

$$\prod_{i=1}^{\frac{1}{2}n-1} (2^i + 1). \tag{72}$$


Proof. Take $\mathcal{C} = \{\mathbf{0}, \mathbf{1}\}$ in Theorem 18. Q.E.D.
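For very small lengths the count in Corollary 19 can be confirmed by exhaustive search. The following sketch is an added illustration with hypothetical function names: it represents binary vectors as integers, grows weakly self-dual codes one dimension at a time by adjoining an even-weight vector orthogonal to the code, and collects the distinct codes of dimension $n/2$ as sets of codewords. It is practical only for small $n$, since the number of intermediate codes grows quickly.

```python
def count_self_dual_codes(n):
    """Brute-force count of binary self-dual codes of even length n."""
    if n % 2:
        return 0

    def weight(v):
        return bin(v).count("1")

    codes = {frozenset([0])}          # the zero code, dimension 0
    for _ in range(n // 2):           # raise the dimension n/2 times
        next_codes = set()
        for code in codes:
            for v in range(1, 1 << n):
                # v must be new, self-orthogonal (even weight) and
                # orthogonal to every codeword already present.
                if v in code or weight(v) % 2:
                    continue
                if any(weight(v & c) % 2 for c in code):
                    continue
                next_codes.add(frozenset(code | {c ^ v for c in code}))
        codes = next_codes
    return len(codes)


def corollary_19_count(n):
    """Product of (2^i + 1) for i = 1, ..., n/2 - 1."""
    result = 1
    for i in range(1, n // 2):
        result *= 2 ** i + 1
    return result


if __name__ == "__main__":
    for n in (2, 4, 6):
        print(n, count_self_dual_codes(n), corollary_19_count(n))
```

The two columns agree for n = 2, 4, 6 (the values 1, 3, 15).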


Corollary 20. Let $v$ be a binary vector of even weight other than $\mathbf{0}$ or $\mathbf{1}$. The number of self-dual codes containing $v$ is

$$\prod_{i=1}^{\frac{1}{2}n-2} (2^i + 1). \tag{73}$$


Theorem 21. ([887].) There exist long binary self-dual codes which meet the Gilbert-Varshamov bound.

Proof. This follows from Corollaries 19, 20 in the usual way (see Theorem 31 of Ch. 17). Q.E.D.


A corresponding argument shows that the same result holds for self-dual codes over GF(q) ([1061]). The hardest case is that of even self-dual codes, and the reader is referred to [887] for the proof of the next result.


Theorem 22. Let $n$ be a multiple of 8, and suppose $\mathcal{C}$ is an $[n, k]$ w.s.d. binary code with all weights divisible by 4. The number of $[n, \frac{1}{2}n]$ even self-dual codes containing $\mathcal{C}$ is

$$2 \prod_{i=1}^{\frac{1}{2}n-k-1} (2^i + 1). \tag{74}$$


Corollary 23. The total number of even self-dual codes of length $n$ is

$$2 \prod_{i=1}^{\frac{1}{2}n-2} (2^i + 1). \tag{75}$$

The number which contain a given vector $v$ other than $\mathbf{0}$ or $\mathbf{1}$, with $\operatorname{wt}(v) \equiv 0 \pmod 4$, is

$$2 \prod_{i=1}^{\frac{1}{2}n-3} (2^i + 1). \tag{76}$$


Theorem 24. (Thompson [1320a]; see also [887].) There exist long even self-dual codes which meet the Gilbert-Varshamov bound.


For the next six problems, let ®,, =the class of w.s.d. binary [n, k] codes, 
$;, =the subclass of ®,, of codes containing 1, for 0 < k < n/2 (see [1063]). 


Problems. (14) Let n be even and € € $7... Show that the number of codes in 
@!_{k = s) which contain € is 


k-s-1 3n-2s-21 c= 1 
MM 2 NS 1 s 
(15) Show that the total number of codes in ®;, is 


k-i 2^7 - 1 


FIT if n even, 0 if n odd. 
i=l m 
(16) Let € € Pns — Phs. Show that the number of codes in $,, — Pix (k = s)
which contain €, is 


(17) The total number of codes in $,, — hs is 


k 25-2 = 


ak 
H j=1 2—1 





(n even), 


(18) Let n be even and € € ns — $;,. The number of codes in ®„« (k > s) 
which contain € is 


(29- F3 Lr 1) Ir (252 ul »/Ti (2i i 1). 


(19) If n is even, the total number of codes in ®,, is 


k-1 k 
(2"-* — DII Q^ — »/TI Q! — 1). 


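(20) Show that the sum of the weight enumerators of all self-dual codes of length $n$ is (for $n$ even)

$$\prod_{j=1}^{(n/2)-2} (2^j + 1) \left\{ 2^{(n/2)-1}(x^n + y^n) + \sum_{2 \mid i} \binom{n}{i} x^{n-i} y^i \right\}.$$

The same sum for even self-dual codes is (if $n$ is divisible by 8)

$$\prod_{j=0}^{(n/2)-3} (2^j + 1) \left\{ 2^{(n/2)-2}(x^n + y^n) + \sum_{4 \mid i} \binom{n}{i} x^{n-i} y^i \right\}.$$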
(21) ([1053, 1054].) Let $\mathcal{C}$ be an $[n, k]$ w.s.d. code over GF(3) which is maximal in the sense of not being contained in any larger w.s.d. code of the same length. (a) Show that

$$k = \begin{cases} \frac{1}{2}(n - 1) & \text{if } n \text{ is odd}, \\ \frac{1}{2}n & \text{if } n \equiv 0 \pmod 4, \\ \frac{1}{2}(n - 2) & \text{if } n \equiv 2 \pmod 4. \end{cases}$$

(b) The number of such maximal codes is

$$\begin{aligned} 2 \prod_{i=1}^{(n-2)/2} (3^i + 1) \quad &\text{if } n \equiv 0 \pmod 4, \\ \prod_{i=1}^{(n-1)/2} (3^i + 1) \quad &\text{if } n \text{ is odd}, \\ \prod_{i=2}^{n/2} (3^i + 1) \quad &\text{if } n \equiv 2 \pmod 4. \end{aligned}$$

(c) Thus a self-dual code exists iff $n$ is a multiple of 4.

(22) ([1054].) More generally, show that an $[n, \frac{1}{2}n]$ self-dual code exists over GF(q) iff one of the following holds: (i) $q$ and $n$ are both even, (ii) $q \equiv 1 \pmod 4$ and $n$ is even, or (iii) $q \equiv 3 \pmod 4$ and $n$ is a multiple of 4.

(23) ([1054].) Show that if the conditions of the previous problem are satisfied then the number of $[n, \frac{1}{2}n]$ self-dual codes over GF(q) is

$$b \prod_{i=1}^{\frac{1}{2}n-1} (q^i + 1),$$

where $b = 1$ if $q$ is even, $b = 2$ if $q$ is odd.


Research Problem (19.5). How many inequivalent self-dual codes of length n 
are there? (See [1058, 1059, 1062, 1063, 892] for small values of n.) 


Notes on Chapter 19 


There are many parallels between self-dual codes and certain types of 
lattice sphere packings. There are analogous theorems about lattices to most 
of the theorems of this chapter. See Berlekamp et al. [129], Broué [200], 
Broué and Enguehard [201], Conway [302-306], Gunning [570], Leech [804, 
805, 807-809], Leech and Sloane [810], Mallows et al. [891], Milnor and 
Husemoller [962], Niemeier [992] and Serre [1184]. 

The ALTRAN system (Brown [205], Hall [579]) for rational function 
manipulation makes it very easy to work with weight enumerators on a 
computer. 

This chapter is based in part on the survey [1231]; see also [1228]. Another 
reference dealing with self-dual codes is [141]. For more about self-dual and 
formally self-dual codes over GF(4) see [1478]. 


The Golay codes 


§1. Introduction 


In this chapter we complete the study of the Golay codes by showing that
their automorphism groups are the Mathieu groups, and that these codes are
unique. There are four of these important codes: [23, 12, 7] and [24, 12, 8]
binary codes, and [11, 6, 5] and [12, 6, 6] ternary codes, denoted by
$\mathcal{G}_{23}$, $\mathcal{G}_{24}$, $\mathcal{G}_{11}$, $\mathcal{G}_{12}$ respectively.


Properties of the binary Golay codes. We begin with $\mathcal{G}_{24}$ (rather than
$\mathcal{G}_{23}$) since this has the larger automorphism group and is therefore more
fundamental. The [24, 12, 8] code $\mathcal{G}_{24}$ may be defined by any of the generator
matrices given in Fig. 2.13, Equations (41), (48) of Ch. 16, or Equation (6) below; or by
the $|a + x|b + x|a + b + x|$ construction (Theorem 12 of Ch. 18); or by adding an
overall parity check to $\mathcal{G}_{23}$. $\mathcal{G}_{24}$ is self-dual (Lemma 18 of Ch. 2,
Theorem 7 of Ch. 16), has all weights divisible by 4 (Lemma 19 of Ch. 2, Theorem 8 of
Ch. 16), and has weight distribution:


i:      0     8     12     16    24
A_i:    1   759   2576    759     1        (1)


The codewords of weight 8 in $\mathcal{G}_{24}$ form the blocks of a Steiner system
S(5, 8, 24) (Corollary 23 of Ch. 2; see also Corollary 25 and Theorem 26 of Ch.
2). These codewords are called octads, and the same name is also used for the
set of eight coordinates where the codeword is nonzero. Let $\mathcal{O} = \{x_1 x_2 \cdots x_8\}$
be an octad and denote the number of octads which contain $x_1, \ldots, x_j$ but not
$x_{j+1}, \ldots, x_i$ by $\lambda_{ij}$ $(0 \le j \le i \le 8)$. These numbers are independent of
the choice of $\mathcal{O}$ (Theorem 10 of Ch. 2), and are tabulated in Fig. 2.14. We will show
below (Theorem 9) that the Steiner system S(5, 8, 24) is unique. Since there is
a generator matrix for $\mathcal{G}_{24}$ all of whose rows have weight 8, it follows that the
octads of S(5, 8, 24) generate $\mathcal{G}_{24}$. This is used in §6 to show that $\mathcal{G}_{24}$ is itself
unique (Theorem 14). The codewords of weight 12 in $\mathcal{G}_{24}$ are called dodecads.

The automorphism group of $\mathcal{G}_{24}$ contains $PSL_2(23)$ (Theorem 10 of Ch. 16),
but is in fact equal to the much larger Mathieu group $M_{24}$ (Theorem 1). This is
a 5-fold transitive group, of order $24\cdot23\cdot22\cdot21\cdot20\cdot48 = 244823040$, and is
generated by $PSL_2(23)$ and an additional permutation W given in Equation
(8).

The [23, 12, 7] perfect code $\mathcal{G}_{23}$ may be obtained by deleting any coordinate
of $\mathcal{G}_{24}$ (it doesn't matter which, by Corollary 11 of Ch. 16), or as a quadratic
residue code with idempotent equal to either of the polynomials in Equation
(10) of Ch. 16, and generator polynomial given in Equation (4) of Ch. 16. The
weight distribution is:

i:      0     7     8     11     12    15    16    23
A_i:    1   253   506   1288   1288   506   253     1        (2)

The codewords of weight 7 in $\mathcal{G}_{23}$ form the blocks of a Steiner system
S(4, 7, 23), and these codewords generate $\mathcal{G}_{23}$. Both $\mathcal{G}_{23}$ and S(4, 7, 23) are
unique (Corollary 16 and Problem 14).

The full automorphism group of $\mathcal{G}_{23}$ is the Mathieu group $M_{23}$ (Corollary 8).
This is a 4-fold transitive group of order $23\cdot22\cdot21\cdot20\cdot48 = 10200960$.

Decoding methods for these codes are given in §9 of Ch. 16.


Properties of the ternary Golay codes. The [12, 6, 6] code $\mathcal{G}_{12}$ may be defined
by any of the generator matrices given in Equations (25) or (61) of Ch. 16, or
Equation (13) below; or by adding an overall parity check to $\mathcal{G}_{11}$. $\mathcal{G}_{12}$ is
self-dual (Theorem 7 of Ch. 16), so has all weights divisible by 3. The
minimum distance is 6 (from Theorem 1 of Ch. 16), hence the Hamming and
complete weight enumerators are as given in Equation (6) of Ch. 19. The
supports (the nonzero coordinates) of codewords of weight 6 form the 132
blocks of the Steiner system S(5, 6, 12) (see Fig. 16.9). Both $\mathcal{G}_{12}$ and S(5, 6, 12)
are unique (Theorem 20 and Problem 18).

The automorphism group $\mathrm{Aut}(\mathcal{G}_{12})^\dagger$ (p. 493 of Ch. 16) contains a group
isomorphic to $PSL_2(11)$ (Theorem 12 of Ch. 16), but is in fact isomorphic to
the much larger Mathieu group $M_{12}$ (Theorem 18). This is a 5-fold transitive
group of order $12\cdot11\cdot10\cdot9\cdot8 = 95040$.

The [11, 6, 5] perfect code $\mathcal{G}_{11}$ may be obtained by deleting any coordinate
of $\mathcal{G}_{12}$ (again it doesn't matter which), or as a quadratic residue code with
idempotent and generator polynomials given by Equations (16), (5) of Ch. 16.
The Hamming and complete weight enumerators of $\mathcal{G}_{11}$ are given by Mallows
et al. [892]. The supports of the codewords of weight 5 form the blocks of the
Steiner system S(4, 5, 11). $\mathcal{G}_{11}$ and S(4, 5, 11) are unique (Corollary 21 and
Problem 18).
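
The block counts quoted above (759 octads, 132 blocks of S(5, 6, 12), 253 codewords of weight 7) follow from the standard counting argument for Steiner systems: every t-subset lies in exactly one block, so there are $\binom{v}{t}/\binom{k}{t}$ blocks. A small sketch of that arithmetic; the function name `steiner_block_count` is illustrative.

```python
from math import comb


def steiner_block_count(t: int, k: int, v: int) -> int:
    """Number of blocks of a Steiner system S(t, k, v): C(v, t) / C(k, t)."""
    blocks, remainder = divmod(comb(v, t), comb(k, t))
    if remainder:
        raise ValueError("no S(t, k, v) can exist: block count is not an integer")
    return blocks


for params in [(5, 8, 24), (4, 7, 23), (5, 6, 12), (4, 5, 11)]:
    print(params, steiner_block_count(*params))
```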
§2. The Mathieu group $M_{24}$


In this section the Mathieu group $M_{24}$ is defined and shown to preserve $\mathcal{G}_{24}$.


Notation. The coordinates of $\mathcal{G}_{23}$ will be labeled {0, 1, ..., 22}, and the
coordinates of $\mathcal{G}_{24}$ by Ω = {0, 1, ..., 22, ∞}, the last coordinate containing the
overall parity check. Also


Q = {1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18},
N = {5, 7, 10, 11, 14, 15, 17, 19, 20, 21, 22}        (3)


denote the quadratic residues and nonresidues mod 23.
For concreteness we take $\mathcal{G}_{23}$ to be the cyclic code with idempotent

$$\theta(x) = \sum_{i \in N} x^i \tag{4}$$

and generator polynomial

$$(1 + x + x^{20})\,\theta(x) = 1 + x^2 + x^4 + x^5 + x^6 + x^{10} + x^{11}. \tag{5}$$
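
The sets in (3) and the product in (5) can be checked mechanically. The sketch below assumes that the idempotent is $\theta(x) = \sum_{i\in N} x^i$ and that the multiplier in (5) is $1 + x + x^{20}$; a binary polynomial is represented by its set of exponents, and the helper name `mul_mod` is an illustrative choice.

```python
P = 23

# Quadratic residues and nonresidues mod 23, as in (3).
Q = sorted({(i * i) % P for i in range(1, P)})
N = [i for i in range(1, P) if i not in Q]


def mul_mod(a, b, p=P):
    """Multiply two GF(2) polynomials (sets of exponents) modulo x^p - 1."""
    out = set()
    for i in a:
        for j in b:
            out ^= {(i + j) % p}  # symmetric difference = addition mod 2
    return out


theta = set(N)
generator = mul_mod({0, 1, 20}, theta)

print(Q)
print(N)
print(sorted(generator))
print(mul_mod(theta, theta) == theta)  # theta is an idempotent
```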


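$\mathcal{G}_{24}$ is then obtained by adding an overall parity check to $\mathcal{G}_{23}$ and has
generator matrix

$$\left[\; \Pi \;\middle|\; \mathbf{1}^{T} \;\right], \tag{6}$$

where $\Pi$ is the 23 x 23 circulant whose first row corresponds to $\theta(x)$ and the
last column, labeled ∞, consists of 1's. The $(i+1)$th row of (6) is $|x^i\theta(x)|1|$, $0 \le i \le 22$.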
From Theorem 10 of Ch. 16, $\mathcal{G}_{24}$ is preserved by the group $PSL_2(23)$, which
has order $\tfrac{1}{2}\cdot23\cdot(23^2-1) = 6072$, and is generated by the following
permutations of Ω (Equation (30) of Ch. 16):


S: i → i + 1,
V: i → 2i,
T: i → −1/i.        (7)


In other words,

S = (∞)(0 1 2 3 ··· 22),
V = (∞)(0)(1 2 4 8 16 9 18 13 3 6 12)
    (5 10 20 17 11 22 21 19 15 7 14),
T = (∞ 0)(1 22)(2 11)(3 15)(4 17)(5 9)(6 19)
    (7 13)(8 20)(10 16)(12 21)(14 18).
Definition. The Mathieu group $M_{24}$ is the group generated by S, V, T and W,
where


W sends   ∞ to 0,   0 to ∞,
          i to $-(i/2)^2$   if i ∈ Q,
          i to $(2i)^2$   if i ∈ N,        (8)


or equivalently


W = (∞ 0)(3 15)(1 17 6 14 2 22 4 19 18 11)
    (5 8 7 12 10 9 20 13 21 16).
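
The two descriptions of W can be compared directly. The following sketch assumes the rule in (8) reads $i \to -(i/2)^2$ for $i \in Q$ and $i \to (2i)^2$ for $i \in N$, with arithmetic mod 23; the label `"inf"` for the point ∞ and the helper `cycles` are illustrative choices.

```python
P = 23
INF = "inf"  # label used here for the point at infinity
Q = {(i * i) % P for i in range(1, P)}  # quadratic residues mod 23


def W(i):
    """The permutation W, under the reading of Equation (8) stated above."""
    if i == INF:
        return 0
    if i == 0:
        return INF
    if i in Q:
        half = i * pow(2, -1, P) % P  # i/2 in GF(23)
        return (-half * half) % P
    return (2 * i) ** 2 % P


def cycles(perm, points):
    """Disjoint cycles of perm, each started at its first point in `points`."""
    seen, result = set(), []
    for start in points:
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm(i)
        result.append(tuple(cycle))
    return result


w_cycles = cycles(W, [INF] + list(range(P)))
for c in w_cycles:
    print(c)
```

The printed cycles should agree with the cycle form of W given above.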


Theorem 1. $M_{24}$ preserves $\mathcal{G}_{24}$.


Proof. We have only to check that W fixes $\mathcal{G}_{24}$. It is easily verified that
$W(|\theta(x)|1|) = |\theta(x)|1| + \mathbf{1} \in \mathcal{G}_{24}$,
W(|xO(x)]1]) =| x70(x) + x" 600 + x?6()]1] € Gos, 
W(|x?6(x)]1D =] O(x) + x6(x) + x6(x) + x?8(x)]0| € Gos. 
We now make use of the identity


$VW = WV^2$ = (∞ 0)(18 21)(1 22 16 20 6 10 13 15 12 17)
              (2 19 3 14 8 5 9 11 4 7).        (9)


Since $V(|x^i\theta(x)|1|) = |x^{2i}\theta(x)|1|$, we have


$W(|x^{2i}\theta(x)|1|) = (VW)(|x^i\theta(x)|1|) = (WV^2)(|x^i\theta(x)|1|)$,
and so W transforms every row of (6) into a codeword of $\mathcal{G}_{24}$. Q.E.D.
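
Theorem 1 can also be confirmed by brute force. The sketch below is an independent check rather than the argument of the proof: it assumes the rows of (6) are $|x^i\theta(x)|1|$ with $\theta(x) = \sum_{i\in N} x^i$, takes W from (8), stores each vector as a 24-bit integer with bit 23 standing for the coordinate ∞, and tests membership in the code by comparing ranks over GF(2). All function names are illustrative.

```python
P = 23
INF = 23  # bit index used here for the coordinate labeled infinity

Q = {(i * i) % P for i in range(1, P)}  # quadratic residues mod 23
N = set(range(1, P)) - Q                # nonresidues mod 23


def W(i):
    """The permutation W of Equation (8), acting on coordinate labels."""
    if i == INF:
        return 0
    if i == 0:
        return INF
    if i in Q:
        half = i * pow(2, -1, P) % P
        return (-half * half) % P
    return (2 * i) ** 2 % P


def permute(perm, v):
    """Move the entry in coordinate i of the bitmask v to coordinate perm(i)."""
    return sum(1 << perm(i) for i in range(24) if (v >> i) & 1)


def rank_gf2(vectors):
    """Rank over GF(2) of a list of bitmask vectors (Gaussian elimination)."""
    pivots = {}  # leading bit position -> reduced vector with that leading bit
    for v in vectors:
        while v:
            h = v.bit_length() - 1
            if h not in pivots:
                pivots[h] = v
                break
            v ^= pivots[h]
    return len(pivots)


# Row i of (6): the cyclic shift x^i * theta(x), followed by a parity bit 1.
rows = [sum(1 << ((j + i) % P) for j in N) | (1 << INF) for i in range(P)]

dimension = rank_gf2(rows)
w_preserves_code = all(
    rank_gf2(rows + [permute(W, r)]) == dimension for r in rows
)
print(dimension, w_preserves_code)
```

A rank that does not grow when the image of a row is adjoined means the image already lies in the span of the rows, which is the statement that W maps the code into itself.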


§3. $M_{24}$ is five-fold transitive

Theorem 2. $M_{24}$ is five-fold transitive.


Proof. $M_{24}$ contains the permutation $U = W^{-2}$:


U = (∞)(0)(3)(15)(1 18 4 2 6)(5 21 20 10 7)(8 16 13 9 12)(11 19 22 14 17).
(10)


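$M_{24}$ is generated by S, T, U, V, since $W = TU^2$. By multiplying the generators
we find permutations of the following cycle types: $1\cdot23$, $1^2\cdot11^2$, $1^3\cdot7^3$, $1^4\cdot5^4$,
$1^8\cdot2^8$, $2^{12}$ and $4^6$. (For example, S, V, $US^3$, U, $(SU)^3$ and T have the first six
of these types; the type $4^6$ arises as a power of a suitable product of S, T and U.)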

The permutations of cycle types $1\cdot23$ and $2^{12}$ show that $M_{24}$ is transitive.
By conjugating (using Problem 1), we see that the stabilizer of any point (the
subgroup leaving that point fixed) contains a permutation of type $1\cdot23$, so is
transitive on the remaining 23 points. Therefore $M_{24}$ is doubly transitive.
Again by conjugating, the stabilizer of two points contains the types $1^2\cdot11^2$
and $1^3\cdot7^3$, so is transitive on the remaining 22 points. Therefore $M_{24}$ is triply
transitive. Similarly the stabilizer of three points contains the types $1^3\cdot7^3$ and
$1^4\cdot5^4$, so is transitive on the remaining 21 points. Therefore $M_{24}$ is quadruply
transitive. The subgroup fixing a set of 4 points as a whole contains the types
$1^4\cdot5^4$ and $4^6$, so is transitive on the remaining 20 points. Therefore $M_{24}$ is
transitive on 5-element subsets of Ω. The subgroup fixing any 5-element
subset as a whole contains the types $1^4\cdot5^4$ and $1^8\cdot2^8$, which induce
permutations of types 5 and $1^3\cdot2$ inside the 5-set (again conjugating using
Problem 1). Since the latter two permutations generate the full symmetric
group on 5 symbols (Problem 2), $M_{24}$ is quintuply transitive. Q.E.D.
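
Cycle types of products of the generators are tedious by hand but immediate by machine. The sketch below assumes S, V, T as in (7) and takes U from the closed form of Problem 5 ($x \to x^3/9$ on 0, ∞ and Q, $x \to 9x^3$ on N); products are composed left to right, so `then(U, S)` means "first U, then S". The particular exponents tried are choices made for this illustration, and the helper names are hypothetical.

```python
P = 23
INF = 23  # index used here for the point at infinity
Q = {(i * i) % P for i in range(1, P)}


def S(i):
    return INF if i == INF else (i + 1) % P


def V(i):
    return INF if i == INF else (2 * i) % P


def T(i):
    if i == INF:
        return 0
    if i == 0:
        return INF
    return (-pow(i, -1, P)) % P


def U(i):
    if i == INF:
        return INF
    if i == 0 or i in Q:
        return pow(i, 3, P) * pow(9, -1, P) % P
    return 9 * pow(i, 3, P) % P


def then(*perms):
    """Compose left to right: then(f, g)(i) == g(f(i))."""
    def composed(i):
        for p in perms:
            i = p(i)
        return i
    return composed


def power(perm, k):
    return then(*([perm] * k))


def cycle_type(perm):
    """Sorted list of cycle lengths of perm on the 24 points."""
    seen, lengths = set(), []
    for start in range(24):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = perm(i)
            length += 1
        lengths.append(length)
    return sorted(lengths)


examples = {
    "S": S,
    "V": V,
    "US^3": then(U, power(S, 3)),
    "U": U,
    "(SU)^3": power(then(S, U), 3),
    "T": T,
}
for name, perm in examples.items():
    print(name, cycle_type(perm))
```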


Problems. (1) Let π, σ be permutations, with π written as a product of
disjoint cycles. Show that the conjugate of π by σ, $\sigma^{-1}\pi\sigma$, is obtained from π
by applying σ to the symbols in π. E.g. if π = (12)(345), σ = (1524), then
$\sigma^{-1}\pi\sigma$ = (54)(312).

(2) Show that any permutation of {1, 2, ..., n} is generated by the
permutations (12 ··· n) and (12).

(3) Show $TVT = V^{-1}$ and $U^{-1}VU = V^3$.

(4) Show $T = W^5$, so $M_{24}$ is generated by S, V and W.

(5) Show that U sends $x \to x^3/9$ if x = 0, ∞ or x ∈ Q, and $x \to 9x^3$ if x ∈ N.

(6) Show that $M_{24}$ is transitive on octads. [Hint: use Theorem 2.]


§4. The order of $M_{24}$ is $24\cdot23\cdot22\cdot21\cdot20\cdot48$


First we need a lemma. Let Γ be a group of permutations of a set S, and let
T be a subset of S. γ ∈ Γ sends T into γ(T) = {γ(t): t ∈ T}. The set of all
γ(T) is called the orbit of T under Γ, and is denoted by $T^\Gamma$. Let $\Gamma_T$ be the
subgroup of Γ fixing T setwise (i.e. if t ∈ T and γ ∈ $\Gamma_T$, then γ(t) ∈ T).


Lemma 3.
$$|\Gamma| = |\Gamma_T| \cdot |T^\Gamma|.$$


The most important case is when T consists of a single point.


Proof. For γ, δ ∈ Γ we have γ(T) = δ(T) iff $\gamma\delta^{-1} \in \Gamma_T$. Thus $|T^\Gamma|$ is equal to
the number of cosets of $\Gamma_T$ in Γ, which is $|\Gamma|/|\Gamma_T|$. Q.E.D.
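
Lemma 3 is the familiar orbit-stabilizer count, and a toy case makes it concrete. The sketch below is not taken from the text: it uses the symmetric group on four points acting on the 2-element subset T = {0, 1}, with each permutation stored as a tuple g where g[i] is the image of i.

```python
from itertools import permutations

points = range(4)
group = list(permutations(points))  # the symmetric group on 4 points
T = frozenset({0, 1})


def image(g, subset):
    """Image of a subset under the permutation g."""
    return frozenset(g[t] for t in subset)


orbit = {image(g, T) for g in group}                 # all images of T
stabilizer = [g for g in group if image(g, T) == T]  # fixes T setwise

print(len(group), len(stabilizer), len(orbit))
```

Here the group has order 24, the setwise stabilizer of T has order 4, and the orbit consists of the 6 two-element subsets.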
Theorem 4.
$$|M_{24}| = 24\cdot23\cdot22\cdot21\cdot20\cdot48 = 244823040.$$


Proof. Suppose {a, b, c, d, e, f, g, h} is an octad. Let Γ be the subgroup of $M_{24}$
fixing this octad setwise, and let H be the subgroup of Γ which fixes in
addition a ninth point i. Since $M_{24}$ is transitive on octads (Problem 6),
$|M_{24}| = 759|\Gamma|$ by Lemma 3. The calculation of |Γ| is in four steps. (i) We
show H is a subgroup of Γ of index 16, so |Γ| = 16|H|. (ii) By looking at the
action of H on the remaining 15 points, we show H is isomorphic to a
subgroup of GL(4, 2), so |H| ≤ |GL(4, 2)| = 20160. (iii) By looking at the
action of H inside the octad, we find H contains a subgroup isomorphic to the
alternating group $\mathcal{A}_8$, of order $\tfrac{1}{2}8! = 20160$. (iv) Therefore $H \cong GL(4, 2) \cong \mathcal{A}_8$,
|H| = 20160, |Γ| = 16 · 20160, and $|M_{24}| = 759\cdot16\cdot20160 = 244823040$.
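
The numbers in this outline can be recomputed in a few lines. The sketch uses the standard formula $|GL(n, q)| = \prod_{i=0}^{n-1}(q^n - q^i)$, which is not stated in the text; the function name `order_gl` is illustrative.

```python
from math import factorial, prod


def order_gl(n: int, q: int) -> int:
    """Order of GL(n, q): the number of ordered bases of an n-space over GF(q)."""
    return prod(q**n - q**i for i in range(n))


gl42 = order_gl(4, 2)          # upper bound for |H| from step (ii)
a8 = factorial(8) // 2         # lower bound for |H| from step (iii)
order_m24 = 759 * 16 * gl42    # octads * index of H * |H|

print(gl42, a8, order_m24)
```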


Step (i). A permutation π of $\mathrm{Aut}(\mathcal{G}_{24})$ which fixes 5 points setwise must fix
the octad $\mathcal{O}$ containing them (or else $\mathrm{wt}(\mathcal{O} + \pi(\mathcal{O})) < 8$). $M_{24}$ contains a
permutation of type $1\cdot3\cdot5\cdot15$, e.g. $US^{11}$. The octad containing the 5-cycle in
this permutation is fixed by it and so must be the union of the 3- and the
5-cycles. Then by conjugating $US^{11}$, using Problem 1, we may assume Γ
contains the permutation A = (abcde)(fgh)(i)(jkl ··· x). $M_{24}$ also contains a
permutation of type $1^2\cdot2\cdot4\cdot8^2$, e.g. $US^5$. The octad containing one of the
fixed points and the 4-cycle must be the union of the 4-, 2-, and 1-cycles.
Therefore Γ contains a permutation which fixes the octad {a, ..., h} setwise
and permutes the remaining 16 points {i, j, k, ..., x} in two cycles of length 8.
Thus Γ is transitive on {i, ..., x}, and so H, the subgroup fixing i, has index 16
in Γ by Lemma 3.


Step (ii). Let $\mathcal{C}$ be the code of length 15 obtained from those codewords of
$\mathcal{G}_{24}$ which are zero on the coordinates {a, b, ..., i}. Since $\mathcal{G}_{24}$ is self-dual we
know exactly the dependencies in $\mathcal{G}_{24}$ among these 9 coordinates: there is
just one, corresponding to the octad {a, b, ..., h}. Therefore $\mathcal{C}$ has size
$2^{12}/2^8 = 2^4$. Furthermore $\mathcal{C}$ only contains codewords of weights 8 and 12, and
12 is impossible or else $\mathcal{G}_{24}$ would contain a word of weight 20. Therefore $\mathcal{C}$ is
a [15, 4, 8] code containing 15 codewords of weight 8. It follows (Problem 7)
that $\mathcal{C}$ is equivalent to the simplex code with generator matrix


111111110000000
111100001111000
110011001100110
101010101010101        (11)


and has automorphism group GL(4, 2).
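
The stated parameters of the code generated by (11) can be confirmed by listing its 16 codewords. A minimal sketch, with each row stored as a 15-bit integer; the variable names are illustrative.

```python
from itertools import product

G = [
    "111111110000000",
    "111100001111000",
    "110011001100110",
    "101010101010101",
]
rows = [int(r, 2) for r in G]

# All GF(2)-linear combinations of the four rows.
codewords = []
for coeffs in product([0, 1], repeat=4):
    word = 0
    for c, r in zip(coeffs, rows):
        if c:
            word ^= r
    codewords.append(word)

weights = sorted(bin(w).count("1") for w in codewords)
columns = {"".join(r[j] for r in G) for j in range(15)}

print(len(set(codewords)), weights)
print(len(columns), "0000" in columns)
```

The 15 columns are exactly the nonzero binary 4-tuples, which is why every nonzero codeword has weight 8.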
Each nontrivial permutation in H induces a nontrivial permutation of $\mathcal{C}$.
(For if h, h′ ∈ H induce the same permutation on $\mathcal{C}$, then $h^{-1}h'$ has 16 fixed
points, and (Problem 8) only the identity permutation in $M_{24}$ can fix 16 points.)
Thus H is isomorphic to a subgroup of GL(4, 2).


Step (iii). Let $H_1$ be the group of permutations on the octad {a, ..., h}
induced by H. The 5th power of the permutation A defined in Step (i) gives
a 3-cycle on {f, g, h} in $H_1$, and by conjugating it we can get all 3-cycles.
Therefore $H_1$ contains the alternating group $\mathcal{A}_8$ (which is generated by
3-cycles, by Problem 9). Q.E.D.


This theorem has a number of important corollaries. 
Corollary 5. $M_{24}$ is the full automorphism group of $\mathcal{G}_{24}$.


Proof. Let $M = \mathrm{Aut}(\mathcal{G}_{24})$. We know $M \supseteq M_{24}$. The proof of Theorem 4 goes
through unchanged if $M_{24}$ is replaced throughout by M. Hence |M| = 244823040
and $M = M_{24}$. Q.E.D.


Corollary 6. The subgroup of $M_{24}$ fixing an octad setwise has order 8 · 8!.


Corollary 7. $GL(4, 2) \cong \mathcal{A}_8$. (Corollaries 6 and 7 follow directly from the proof of
Theorem 4.)


Definition. The Mathieu group $M_{23}$ consists of the permutations in $M_{24}$ fixing a
point in Ω. Thus $M_{23}$ is a 4-fold transitive group of order $23\cdot22\cdot21\cdot20\cdot48$.


Corollary 8. $M_{23}$ is the full automorphism group of $\mathcal{G}_{23}$.


Problems. (7) Let $\mathcal{C}$ be any [15, 4, 8] code containing 15 codewords of weight 8.
Show that $\mathcal{C}$ is equivalent to the simplex code with generator matrix (11).
Hence show that the automorphism group of $\mathcal{C}$ is GL(4, 2). [Hint: Theorem 24
of Ch. 13.]

(8) Show that a permutation π in $M_{24}$ which fixes 16 points (individually),
where the other 8 form an octad, must fix all 24 points. [Hint: Suppose π fixes
8, ..., 23 where t = {0, ..., 7} is an octad. An octad s with |s ∩ t| = 4 is
transformed by π into s or t + s. By considering all such s, show π is the
identity.]

(9) Show that the alternating group $\mathcal{A}_n$ is generated by all 3-cycles (ijk).

(10) Show that the pointwise stabilizer of an octad is an elementary abelian
group of order 16, and is transitive on the remaining 16 points; and that the
pointwise stabilizer of any five points has order 48. [Hint: in the first part use
the permutation $(US^5)^4$, and in the second use $A^5$.]

(11) Show that $M_{24}$ is transitive on dodecads.


§5. The Steiner system S(5, 8, 24) is unique


The goal of this section is to prove:


Theorem 9. There is a unique Steiner system S(5, 8, 24). More precisely, if there
are two Steiner systems S(5, 8, 24), $\mathcal{A}$ and $\mathcal{B}$ say, then there is a permutation
of the 24 points which sends the octads of $\mathcal{A}$ onto the octads of $\mathcal{B}$. (In this
section an octad means a block of an S(5, 8, 24).)


The proof makes good use of the table of $\lambda_{ij}$'s given in Fig. 2.14. In
particular, the last row of this table implies that two octads meet in either 0, 2
or 4 points. We begin with some lemmas.


Lemma 10. (Todd's lemma.) In an S(5, 8, 24) if B and C are octads meeting in
4 points then B + C is also an octad.


Proof. Let B = {abcdefgh}, C = {abcdijkl} and suppose B + C is not an octad.
Then the octad D containing {efghi} must contain just one more point of C,
say D = {efghijmn}. Similarly the octad containing {efghk} is E = {efghklop}.
But now it is impossible to find an octad containing {efgik} and meeting
B, C, D, E in 0, 2 or 4 points. This is a contradiction, since there must be an
octad containing any five points. Q.E.D.


Lemma 11. If the 280 octads meeting a given octad $\mathcal{O}$ in four points are known,
then all 759 octads are determined.


Proof. From Fig. 2.14 we must find 30 octads disjoint from $\mathcal{O}$ and 448 meeting
$\mathcal{O}$ in two points, i.e. 16 octads meeting $\mathcal{O}$ in any two specified points a and b.

Let $\mathcal{O}$ = {xyzab ···}. There are four octads, besides $\mathcal{O}$, through {xyza}, say
$A_1, A_2, A_3, A_4$, and four through {xyzb}, say $B_1, B_2, B_3, B_4$. The sums $A_i + B_j$
are octads by Lemma 10, are readily seen to be distinct, and are the 16 octads
through a and b. The 6 sums $A_i + A_j$ are octads which are disjoint from $\mathcal{O}$.
Clearly these are distinct and $A_i + A_j \neq B_k + B_l$. Since there are 5 choices for a
these give the 30 octads disjoint from $\mathcal{O}$. Q.E.D.
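
The bookkeeping in Lemma 11 can be checked by adding up the octads according to how they meet $\mathcal{O}$. The sketch takes the per-subset counts from the proof (4 further octads through each 4-subset of $\mathcal{O}$, 16 octads meeting $\mathcal{O}$ in each given pair, and 30 disjoint octads); the variable names are illustrative.

```python
from math import comb

meet_in_4 = comb(8, 4) * 4    # 4 octads besides O through each 4-subset of O
meet_in_2 = comb(8, 2) * 16   # 16 octads meeting O in exactly a given pair
disjoint = 30                 # octads disjoint from O
total = 1 + meet_in_4 + meet_in_2 + disjoint  # the 1 counts O itself

print(meet_in_4, meet_in_2, disjoint, total)
```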
Definition of sextet. Any four points abcd define a partition of the 24 points
into 6 sets of 4, called tetrads, with the property that the union of any two
tetrads is an octad. This set of 6 tetrads is called a sextet (= six tetrads). (To
see this, pick a fifth point e. There is a unique octad containing {abcde}, say
{abcdefgh}. Then {efgh} is the second tetrad. A ninth point i determines the
octad {abcdijkl} and the tetrad {ijkl}. By Todd's lemma {efghijkl} is an octad,
and so on.)


Lemma 12. An octad intersects the 6 tetrads of a sextet either $3\cdot1^5$, $4^2\cdot0^4$ or
$2^4\cdot0^2$. (The first of these means that the octad intersects one tetrad in three
points and the other five in one point.)

Proof. Two octads meet in 0, 2 or 4 points. Q.E.D.


Lemma 13. The intersection matrix for the tetrads of two sextets is one of the
following:


400000      220000      200011      310000
040000      220000      020011      130000
004000      002200      002011      001111
000400  ,   002200  ,   000211  ,   001111
000040      000022      111100      001111
000004      000022      111100      001111

Proof. From Lemma 12 and the definition of a sextet. Q.E.D.


Proof of Theorem 9. In a Steiner system S(5, 8, 24) let $\mathcal{O}$ be a fixed octad. The
idea of the proof is to determine uniquely all the octads meeting $\mathcal{O}$ in 4 points;
the theorem then follows from Lemma 11. To find these octads we shall
construct 7 sextets $S_1, \ldots, S_7$.

Let $x_1 \cdots x_6$ be six points of $\mathcal{O}$ and $x_7$ a point not in $\mathcal{O}$. Suppose $S_1$ is the
sextet defined by $\{x_1x_2x_3x_4\}$ and let its tetrads be the columns of the 4 x 6
array:

$$S_1 = \begin{array}{cccccc} x_1 & x_5 & x_7 & z_4 & z_8 & z_{12}\\ x_2 & x_6 & z_1 & z_5 & z_9 & z_{13}\\ x_3 & y_1 & z_2 & z_6 & z_{10} & z_{14}\\ x_4 & y_2 & z_3 & z_7 & z_{11} & z_{15} \end{array}$$

Thus $\mathcal{O}$ consists of the two left-hand columns.
The octad containing $\{x_2x_3x_4x_5x_7\}$ must intersect $S_1$ $3\cdot1^5$, by Lemma 12, and
so, after relabeling the $z_i$, can be taken to be $\{x_2x_3x_4x_5x_7z_4z_8z_{12}\}$. The tetrad
$\{x_2x_3x_4x_5\}$ determines a sextet $S_2$ which by Lemma 13 can be taken to be

$$S_2 = \begin{array}{cc|cc|cc} 2 & 1 & 3 & 3 & 3 & 3\\ 1 & 2 & 4 & 4 & 4 & 4\\ 1 & 2 & 5 & 5 & 5 & 5\\ 1 & 2 & 6 & 6 & 6 & 6 \end{array}$$

This diagram means that the tetrads of $S_2$ are $\{x_2x_3x_4x_5\}$, $\{x_1x_6y_1y_2\}$, $\{x_7z_4z_8z_{12}\}$,
$\{z_1z_5z_9z_{13}\}$, $\{z_2z_6z_{10}z_{14}\}$ and $\{z_3z_7z_{11}z_{15}\}$. In this notation every row of
$S_1$ reads 1 2 | 3 4 | 5 6.


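Remark. We have now identified the 30 octads disjoint from $\mathcal{O}$. They are (i)
the sums of any two rows of

$$\begin{array}{cc|cc} x_7 & z_4 & z_8 & z_{12}\\ z_1 & z_5 & z_9 & z_{13}\\ z_2 & z_6 & z_{10} & z_{14}\\ z_3 & z_7 & z_{11} & z_{15} \end{array}$$

(ii) the sums of any two columns of this array, and (iii) the sums of (i) and (ii).
(The accompanying illustrations are not reproduced.)
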

The octad containing $\{x_1x_3x_4x_5x_7\}$ intersects both $S_1$ and $S_2$ $3\cdot1^5$, and so can
be taken to be $\{x_1x_3x_4x_5x_7z_5z_{10}z_{15}\}$. The tetrad $\{x_1x_3x_4x_5\}$ defines a sextet $S_3$ which,
using Lemma 13 and considering the intersections with the 30 octads
mentioned in the above Remark, can be taken to be


At this stage we observe that the sextets S,, S; and S; are preserved by the 
following permutations of (ysyszizz °° * zu: 
ar = (ZaZaZi)(25Z:9215)(2122 23) (202 423) (29211213); 
O = (z4ze)(z:2u)(2122 (21321925 19) (2829), 
= Qn. 
The octad containing $\{x_1x_2x_3x_5x_7\}$ cannot contain any of $z_1, z_2, z_3, z_4, z_8, z_{12}, z_5, z_{10}$ or
$z_{15}$ because of the intersections with the previous octads, and so must be
either $\{x_1x_2x_3x_5x_7z_6z_{11}z_{13}\}$ or $\{x_1x_2x_3x_5x_7z_7z_9z_{14}\}$.
Since these are equivalent under σ the octad can be taken to be
$\{x_1x_2x_3x_5x_7z_7z_9z_{14}\}$. The tetrad $\{x_1x_2x_3x_5\}$ then determines the sextet

$$S_4 = \begin{array}{cc|cc|cc} 1 & 1 & 3 & 4 & 5 & 6\\ 1 & 2 & 5 & 6 & 3 & 4\\ 1 & 2 & 6 & 5 & 4 & 3\\ 2 & 2 & 4 & 3 & 6 & 5 \end{array}$$
In the same way we obtain 
11134156 
s-11? 65|43 
12:9 [43:1 6:5 
12156134 


Now the octad containing (xixixsxex;) cuts S, 2*- 0? and so, using m, may be 
assumed to intersect the first four columns of S, 2^. So we can take this octad 
to be (xixaxsxexsziz42:). This gives the sextet 


Se = 


UA CA UVU 
UA mW WwW 
ADA AL 
DNA HL 


11 
11 
2:2 
22 


Similarly, using o, we can take the octad containing {x,x3x5x7z2} to be 
Gxixaxsxsyiz22424), giving the sextet 


11/3355 
s,-|22 44166 
7 7|11133455 

22144166 


It remains to show that the 280 octads meeting $\mathcal{O}$ in 4 points are determined
by $S_1, \ldots, S_7$.

If two sextets meet evenly (i.e. if their tetrads intersect in the third matrix
of Lemma 13), then we can add suitable octads in them to get a new octad and
sextet. For example (the original diagram is not recoverable), an octad of one
of the sextets above added to an octad of another gives a new octad, which in
turn determines a new sextet.


It is easily checked that in this way we obtain all $\tfrac{1}{2}\binom{8}{4} = 35$ sextets defined by
four points of $\mathcal{O}$. But these sextets give all the octads meeting $\mathcal{O}$ in 4 points.
Q.E.D.


Witt's original proof of Theorem 9 was to successively show the
uniqueness of S(2, 5, 21), S(3, 6, 22), S(4, 7, 23) and finally S(5, 8, 24). The
starting point is:


Problems. (12) Show that the affine planes $S(2, m, m^2)$ are unique for m =
2, 3, 4, 5 (see §5 of Ch. 2 and Appendix B, Theorem 11). [Hint: Pick a family
of m nonintersecting blocks in $S(2, m, m^2)$ and call them "horizontal lines,"
and another family of nonintersecting blocks called "vertical lines." The point
at the intersection of vertical line x and horizontal line y is given the
coordinates (x, y). Each of the $m^2 - m$ remaining blocks consists of points
$(1, a_1), (2, a_2), \ldots, (m, a_m)$. Form the $(m^2 - m) \times m$ matrix A(m) whose rows are
the vectors $(a_1 a_2 \cdots a_m)$. Show that A(m) is essentially unique for m = 2-5.]

(13) Show that the projective planes $S(2, m + 1, m^2 + m + 1)$ are unique for
m = 2-5. [Hint: Since the affine planes are unique they are the planes
constructed from finite fields (given in §2 of Appendix B). There is a unique
way to extend such a plane to a projective plane.]

(14) Show that the Steiner systems S(3, 6, 22) and S(4, 7, 23) are unique.
[Hint: Let P be a fixed point of an S(3, 6, 22). The idea is to show that the
blocks containing P belong to an S(2, 5, 21), unique from Problem 13, and the
other blocks form ovals (p. 330 of Ch. 11) in S(2, 5, 21). See Witt [1424] or
Lüneburg [866].]

(15) (a) Show that a Hadamard matrix $H_8$ of order 8 is unique up to
equivalence (cf. p. 48 of Ch. 2). [Hint: Use the fact that S(2, 3, 7) is unique.]
(b) Show that $H_{12}$ is unique.

(16) Show that the Steiner system S(3, 4, 8) is unique, and hence so is the
[8, 4, 4] extended Hamming code.
§6. The Golay codes $\mathcal{G}_{23}$ and $\mathcal{G}_{24}$ are unique


Theorem 14. Let $\mathcal{C}$ be any binary code of length 24 and minimum distance 8.
Then (i) $|\mathcal{C}| \le 2^{12}$. (ii) If $|\mathcal{C}| = 2^{12}$, $\mathcal{C}$ is equivalent to $\mathcal{G}_{24}$.


Proof. (i) This follows from linear programming (§4 of Ch. 17, see also Ch. 17,
Problem 16) or from the sphere-packing bound (Theorem 6 of Ch. 1). (ii)
Suppose $|\mathcal{C}| = 2^{12}$. Then the linear programming bound shows that the
distance distribution of $\mathcal{C}$ is

$$B_0 = B_{24} = 1, \quad B_8 = B_{16} = 759, \quad B_{12} = 2576; \tag{12}$$

i.e. is the same as that of $\mathcal{G}_{24}$. The transform of (12), Equation (23) of Ch. 17,
coincides with (12). Hence it follows from Theorem 3 of Ch. 6 that the weight
and distance distributions of $\mathcal{C}$ coincide, assuming $\mathcal{C}$ contains 0.
Let a, b be two codewords of $\mathcal{C}$. From the formula (Equation (16) of Ch. 1)

$$\mathrm{dist}(a, b) = \mathrm{wt}(a) + \mathrm{wt}(b) - 2\,\mathrm{wt}(a * b)$$

it follows that wt(a * b) is even (since all the other terms are divisible by 4).
Hence every codeword of $\mathcal{C}$ is orthogonal to itself and to every other
codeword. The following simple lemma now implies that $\mathcal{C}$ is linear.
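
The identity used in this step holds for arbitrary binary vectors and is easy to test. In the sketch below vectors are stored as 24-bit integers, so that the componentwise product a * b becomes a bitwise AND; the function names and the sample vectors are illustrative.

```python
import random


def wt(v: int) -> int:
    """Hamming weight of a binary vector stored as an integer bitmask."""
    return bin(v).count("1")


def dist(a: int, b: int) -> int:
    """Hamming distance: the weight of the sum (XOR) of the two vectors."""
    return wt(a ^ b)


def identity_holds(a: int, b: int) -> bool:
    """dist(a, b) == wt(a) + wt(b) - 2 wt(a * b), with a * b the bitwise AND."""
    return dist(a, b) == wt(a) + wt(b) - 2 * wt(a & b)


rng = random.Random(0)
samples = [(rng.getrandbits(24), rng.getrandbits(24)) for _ in range(1000)]
print(all(identity_holds(a, b) for a, b in samples))

# Two weight-8 vectors at distance 8 must overlap in an even number of places.
a, b = 0b11111111, 0b111100001111
print(wt(a), wt(b), dist(a, b), wt(a & b))
```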


Lemma 15.
Let A, B be subsets of $F^n$ which are mutually orthogonal, i.e.

$$\sum_{i=1}^{n} a_i b_i = 0 \quad \text{for all } (a_1 \cdots a_n) \in A,\ (b_1 \cdots b_n) \in B.$$

Suppose further that $|A| = 2^k$ and $|B| \ge 2^{n-k-1} + 1$. Then A is a linear code.


Proof. Let $\bar{A}$ and $\bar{B}$ be the linear spans of A and B. Clearly $\bar{A}$, $\bar{B}$ are mutually
orthogonal, hence

$$\dim \bar{A} + \dim \bar{B} \le n.$$

But by hypothesis $\dim \bar{A} \ge k$ and $\dim \bar{B} \ge n - k$. Thus $\dim \bar{A} = k$ and $A = \bar{A}$.
Q.E.D.


Theorem 22 of Ch. 2 proves that the codewords of weight 8 in $\mathcal{C}$ are the
octads of a Steiner system S(5, 8, 24). By Theorem 9 this system is unique,
and as mentioned above generates a code equivalent to $\mathcal{G}_{24}$. Hence $\mathcal{C}$ is
equivalent to $\mathcal{G}_{24}$. Q.E.D.


Corollary 16. Any $(23, 2^{12}, 7)$ binary code is equivalent to $\mathcal{G}_{23}$.

Proof. Add an overall parity check and use Theorem 14. Q.E.D.


§7. The automorphism groups of the ternary Golay codes 


In this section we show that the automorphism group $\mathrm{Aut}(\mathcal{G}_{12})^\dagger$ is
isomorphic to the Mathieu group $M_{12}$. Recall from Ch. 16 that if $\mathcal{C}$ is a ternary code,
$\mathrm{Aut}(\mathcal{C})$ consists of all monomial matrices A (permutations with ± signs
attached) which preserve $\mathcal{C}$, and

$$\mathrm{Aut}(\mathcal{C})^\dagger = \mathrm{Aut}(\mathcal{C})/\{\pm I\},$$

which is the quotient group obtained by identifying A and −A. Thus
$|\mathrm{Aut}(\mathcal{C})| = 2\,|\mathrm{Aut}(\mathcal{C})^\dagger|$.

Notation. The coordinates of $\mathcal{G}_{12}$ are labeled by Ω = {0, 1, ..., 9, X, ∞}, and
Q = {1, 3, 4, 5, 9}, N = {2, 6, 7, 8, X}. As the generator matrix for $\mathcal{G}_{12}$ we take


DN (13) 





where $\Pi$ is the 11 x 11 circulant whose first row corresponds to

$$-1 - \sum_{i \in Q} x^i + \sum_{i \in N} x^i,$$

i.e. is the vector (− − 1 − − − 1 1 1 − 1), writing − for −1. (Check that this matrix does
generate $\mathcal{G}_{12}$; see p. 487.)

From Theorem 12 of Ch. 16, $\mathcal{G}_{12}$ is preserved by the group, isomorphic to
$PSL_2(11)$, which is generated by S, V and T′, where S and V are the
permutations

S: i → i + 1,
V: i → 3i,        (14)

and T′ is the monomial transformation which sends the element in position i to
position −1/i after multiplying it by 1 if i = ∞ or i ∈ Q, or by −1 if i = 0 or i ∈ N.

In addition, $\mathcal{G}_{12}$ is also preserved by the permutation


A = (∞)(0)(1)(2 X)(3 4)(5 9)(6 7)(8).        (15)


For it is readily checked that if $w_i$ (i = 0, 1, ..., 10) is the $(i+1)$th row of (13),
then

$$A(w_i) = w_{\delta(i)},$$

where

δ = (∞)(0)(3)(X)(1 9)(2 6)(4 5)(7 8).        (16)


The monomial group generated by S, V, T′ and A will be denoted by $M'_{12}$;
we have shown that

$$\mathrm{Aut}(\mathcal{G}_{12})^\dagger \supseteq M'_{12}.$$


Definition. The Mathieu group $M_{12}$ is the permutation group on Ω obtained by
ignoring the signs in $M'_{12}$. Thus $M_{12}$ is generated by S, V, T and A, where T
sends i to −1/i. Clearly $M'_{12}$ is isomorphic to $M_{12}$.


By imitating the proof of Theorems 2 and 4 we obtain Theorems 17 and 18.


Theorem 17. $M_{12}$ is 5-fold transitive, and has order $12\cdot11\cdot10\cdot9\cdot8 = 95040$.


Theorem 18. $M_{12}$ is the full automorphism group $\mathrm{Aut}(\mathcal{G}_{12})^\dagger$.


Corollary 19. $\mathrm{Aut}(\mathcal{G}_{11})^\dagger$ is isomorphic to the Mathieu group $M_{11}$ consisting of
the permutations in $M_{12}$ fixing a point of Ω. $M_{11}$ is a 4-fold transitive group of
order $11\cdot10\cdot9\cdot8$.


Problem. (17) Show that the subgroup of $M_{24}$ fixing a dodecad setwise is
isomorphic to $M_{12}$.


§8. The Golay codes $\mathcal{G}_{11}$ and $\mathcal{G}_{12}$ are unique

Theorem 20. Any $(12, 3^6, 6)$ ternary code is equivalent to $\mathcal{G}_{12}$.


Sketch of Proof. (1) By linear programming we find that the largest code of
length 12 and distance 6 contains $3^6$ codewords, and has the same distance
distribution as $\mathcal{G}_{12}$.

(2) As before (though with somewhat more trouble) it can be shown that
the code is orthogonal to itself and hence must be linear.

(3) The total number of ternary self-dual codes of length 12 is $N_0 =
2^9\cdot5\cdot7\cdot41\cdot61$ (Problem 21 of Ch. 19). There are two inequivalent [12, 6, 3]
codes, namely $\mathcal{C}_1$, the direct sum of 3 copies of the [4, 2, 3] code #6 of Ch. 1,
and $\mathcal{C}_2$, a code with generator matrix


111 


111l. (17) 

1 - 1 - 1 - 
- 1 1 - 1 

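Then $|\mathrm{Aut}(\mathcal{C}_1)| = 2^3\cdot3!\,(4!)^3$ and $|\mathrm{Aut}(\mathcal{C}_2)| = 2(3!)^4\,4!$. From Lemma 3, there are
$N_1 = 2^{12}\cdot12!/(2^3\cdot3!\,(4!)^3) = 2^9\cdot3\cdot5^2\cdot7\cdot11$ codes equivalent to $\mathcal{C}_1$, and
$N_2 = 2^{12}\cdot12!/(2(3!)^4\,4!) = 2^{14}\cdot5^2\cdot7\cdot11$ codes equivalent to $\mathcal{C}_2$. Finally, there are
$N_3 = 2^{12}\cdot12!/(2|M_{12}|) = 2^{15}\cdot3^2\cdot5\cdot7$ codes equivalent to $\mathcal{G}_{12}$. Since
$N_0 = N_1 + N_2 + N_3$, all self-dual codes of length 12 are accounted for, and $\mathcal{G}_{12}$ is
unique. Q.E.D.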
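
The counting in step (3) is pure arithmetic and can be replayed directly. The sketch assumes the group orders $|\mathrm{Aut}(\mathcal{C}_1)| = 2^3\cdot3!\,(4!)^3$, $|\mathrm{Aut}(\mathcal{C}_2)| = 2(3!)^4\,4!$ and $|\mathrm{Aut}(\mathcal{G}_{12})| = 2\cdot95040$, takes the full monomial group of order $2^{12}\cdot12!$, and computes $N_0$ from the product formula of Problem 21(b) of Ch. 19 for $n \equiv 0 \pmod 4$; the variable names are illustrative.

```python
from math import factorial, prod

monomial_group = 2**12 * factorial(12)  # all signed permutations of 12 coordinates

aut_c1 = 2**3 * factorial(3) * factorial(4) ** 3
aut_c2 = 2 * factorial(3) ** 4 * factorial(4)
aut_g12 = 2 * 95040                     # 2 |M_12|

# Each equivalence class has size |monomial group| / |Aut| (Lemma 3).
N1 = monomial_group // aut_c1
N2 = monomial_group // aut_c2
N3 = monomial_group // aut_g12

# Total number of ternary self-dual codes of length 12.
N0 = 2 * prod(3**i + 1 for i in range(1, 6))

print(N1, N2, N3, N0, N0 == N1 + N2 + N3)
```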


Corollary 21. Any $(11, 3^6, 5)$ ternary code is equivalent to $\mathcal{G}_{11}$.


Problem. (18) Show that the Steiner systems S(3, 4,10), S(4,5,11) and 
S(5,6, 12) are unique. [Hint: Witt [1424], Lüneburg [866].] 


Notes on Chapter 20 


The Golay codes were discovered by Golay [506]. Our sources for this
chapter are as follows. For all of Sections 2, 3, 4 and 7, Conway [306].
Theorem 9 is due to Witt [1424], but the proof given here is from Curtis [322].
Another proof is given by Jónsson [699]. Lemma 10 is from Todd [1330]. The
latter paper gives a list of all 759 octads. Problems 12, 13, 14, 16 are from Witt
(op. cit.), but see also Lüneburg [865, 866]. Theorem 14 was first proved by
Snover [1247], but the proof given here is that of Delsarte and Goethals [363]
and Pless [1054]. Theorem 20 is also from [363] and [1054]. For more about
the enumeration of self-dual codes over GF(3) see Mallows et al. [892]. The
counting method could also be used to prove that $\mathcal{G}_{24}$ is unique, although there
are many more codes to be considered; see Pless and Sloane [1063].

The Mathieu groups were discovered by Mathieu in 1861 ([925, 926]) and
have since accumulated a rich literature; see for example Assmus and
Mattson [35, 36], Berlekamp [120], Biggs [143], Conway [306], Curtis
[322, 323], Garbe and Mennicke [466], Greenberg [559], Hall [584, 590], James
[689], Jónsson [699], Leech [806], Lüneburg [865, 866], Paige [1018], Rasala
[1095], Stanton [1264], Todd [1329, 1330], Ward [1388], Whitelaw [1412], and
Witt [1423, 1424]. These groups are simple, i.e., have no nontrivial proper normal
subgroups (for a proof see for example Biggs [143]), and do not belong to any of the
known infinite families of simple groups. Many properties of $M_{24}$ can be obtained
from $\mathcal{G}_{24}$; for example the maximal subgroups of $M_{24}$ can be simply described
in this way (Conway [306], Curtis [323]). Wielandt [1415] is a good general
reference on permutation groups.

Finally, three very interesting topics not described in this chapter are:

(i) Curtis' Miracle Octad Generator, or MOG, [322], which miraculously
finds the octad containing 5 given points in the S(5, 8, 24),

(ii) Conway's diagram (Fig. 1 of [306], p. 41 of [322]) showing the action of
$M_{24}$ on all binary vectors of length 24, and

(iii) the Leech lattice (see Leech [804], Conway [302-306], Leech and
Sloane [810]), which is a very dense sphere-packing in 24 dimensional space
constructed from $\mathcal{G}_{24}$. The automorphism group of this lattice is very large
indeed; see Conway [302, 303].


Association schemes 


§1. Introduction 


This chapter gives the basic theory of association schemes. An association 
scheme is a set with relations defined on it satisfying certain properties (see 
§2). A number of problems in coding and combinatorics, such as finding the 
largest code or the largest constant weight code with a given minimum 
distance can be naturally stated in terms of finding the largest subset of an 
association scheme. In many cases this leads to a linear programming bound 
for such problems (using Theorem 12). 

Association schemes were first introduced by statisticians in connection
with the design of experiments (Bose and Mesner [183], Bose and Shimamoto
[186], James [688], Ogasawara [1007], Ogawa [1008], Yamamoto et al. [1444],
Raghavarao [1085]), and have since proved very useful in the study of
permutation groups (see Notes) and graphs (Biggs [144-147] and Damerell
[327]). The applications to coding theory given in this chapter are due to
Delsarte [352-358, 361 and 364].

The chapter is arranged as follows. §2 defines association schemes and
gives the basic theory. §§3-6 discuss the three most important examples for
coding theory, the Hamming, Johnson and symplectic form schemes. We will
see that a code is a subset of a Hamming scheme, and a constant weight code
is a subset of a Johnson scheme. Section 7 studies the properties of subsets of
an arbitrary association scheme. Finally §§8, 9 deal with subsets of symplectic
forms and with t-designs.


§2. Association schemes


This section gives the basic theory. 







Definition. An association scheme with $n$ classes (or relations) consists of a 
finite set $X$ of $v$ points together with $n + 1$ relations $R_0, R_1, \ldots, R_n$ defined on 
$X$ which satisfy: 

(i) Each $R_i$ is symmetric: $(x, y) \in R_i \Rightarrow (y, x) \in R_i$. 

(ii) For every $x, y \in X$, $(x, y) \in R_i$ for exactly one $i$. 

(iii) $R_0 = \{(x, x) : x \in X\}$ is the identity relation. 

(iv) If $(x, y) \in R_k$, the number of $z \in X$ such that $(x, z) \in R_i$ and $(y, z) \in R_j$ 
is a constant $c_{ijk}$, depending on $i, j, k$ but not on the particular choice of $x$ and 
$y$. 


Two points $x$ and $y$ are called $i$th associates if $(x, y) \in R_i$. In words, the 
definition states that if $x$ and $y$ are $i$th associates so are $y$ and $x$; every pair of 
points are $i$th associates for exactly one $i$; each point is its own zeroth 
associate while distinct points are never zeroth associates; and finally if $x$ and 
$y$ are $k$th associates then the number of points $z$ which are both $i$th associates 
of $x$ and $j$th associates of $y$ is a constant $c_{ijk}$. 

It is sometimes helpful to visualize an association scheme as a complete 
graph with labeled edges. The graph has $v$ vertices, one for each point of $X$, 
and the edge joining vertices $x$ and $y$ is labeled $i$ if $x$ and $y$ are $i$th associates. 
Each edge has a unique label, and the number of triangles with a fixed base 
labeled $k$ having the other edges labeled $i$ and $j$ is a constant $c_{ijk}$, depending on 
$i, j, k$ but not on the choice of the base. In particular, each vertex is incident 
with exactly $c_{ii0} = v_i$ (say) edges labeled $i$; $v_i$ is the valency of the relation $R_i$. 

There are also loops labeled 0 at each vertex $x$, corresponding to $R_0$. 

Examples of association schemes are given in the following sections. The 
most important for coding theory are the Hamming, Johnson, and symplectic 
form schemes. 

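The $c_{ijk}$ satisfy several identities: 


Theorem 1. 
$$c_{ijk} = c_{jik}, \qquad c_{0jk} = \delta_{jk}, \tag{1}$$
$$v_k c_{ijk} = v_i c_{kji},$$
$$\sum_{j=0}^{n} c_{ijk} = v_i,$$
$$\sum_{m=0}^{n} c_{ijm} c_{mkl} = \sum_{h=0}^{n} c_{ihl} c_{jkh}. \tag{2}$$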

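Proof. The last identity follows from counting the quadrilaterals with sides 
labeled $i, j, k, l$ in two ways, according to which of the two diagonals is used 
to split the quadrilateral into two triangles. [Figure: a quadrilateral with sides 
$i, j, k, l$, drawn once with each diagonal.] Q.E.D. 

We describe the relations by their adjacency matrices. $D_i$, the adjacency 
matrix of $R_i$ (for $i = 0, \ldots, n$), is the $v \times v$ matrix with rows and columns 
labeled by the points of $X$, defined by 

$$(D_i)_{x,y} = \begin{cases} 1 & \text{if } (x, y) \in R_i, \\ 0 & \text{otherwise.} \end{cases}$$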


The definition of an association scheme is equivalent to saying that the $D_i$ 
are $v \times v$ (0, 1)-matrices which satisfy 

(i) $D_i$ is symmetric, 

(ii) $\sum_{i=0}^{n} D_i = J$ (the all-ones matrix), (3) 

(iii) $D_0 = I$, (4) 

(iv) $D_i D_j = \sum_{k=0}^{n} c_{ijk} D_k = D_j D_i$, $\quad i, j = 0, \ldots, n$. (5) 

Indeed the $(x, y)$th entry of the left side of Equation (5) is the number of paths 
$x \overset{i}{-} z \overset{j}{-} y$ in the graph. Also the rows and columns of $D_i$ contain $v_i$ 
1's: 
$$D_i J = J D_i = v_i J. \tag{6}$$


The Bose-Mesner Algebra. Let us consider the vector space $\mathcal{A}$ consisting of 
all matrices $\sum_{i=0}^{n} a_i D_i$, $a_i$ real. From (i), these matrices are symmetric. From 
(ii), $D_0, \ldots, D_n$ are linearly independent, and the dimension of $\mathcal{A}$ is $n + 1$. 
From (iv), $\mathcal{A}$ is closed under multiplication, and multiplication is commutative. 
Multiplication is associative (matrix multiplication is always associative; 
alternatively, associativity follows from (2)). This associative 
commutative algebra $\mathcal{A}$ is called the Bose-Mesner algebra of the association 
scheme. 

Since the matrices in $\mathcal{A}$ are symmetric and commute with each other, they 
can be simultaneously diagonalized (Marcus and Minc [915]). I.e. there is a 
matrix $S$ such that to each $A \in \mathcal{A}$ there is a diagonal matrix $\Lambda_A$ with 

$$S^{-1} A S = \Lambda_A. \tag{7}$$

Therefore $\mathcal{A}$ is semisimple and has a unique basis of primitive idempotents 
$J_0, \ldots, J_n$. These are real $v \times v$ matrices satisfying (see Burrow [212], Ogawa 
[1008], Wedderburn [1393]) 

$$J_i^2 = J_i, \quad i = 0, \ldots, n, \tag{8}$$
$$J_i J_k = 0, \quad i \neq k, \tag{9}$$
$$\sum_{i=0}^{n} J_i = I. \tag{10}$$

From Equations (3), (6), $(1/v)J$ is a primitive idempotent, so we shall always 
choose 

$$J_0 = \frac{1}{v} J.$$
Now we have two bases for $\mathcal{A}$, so let's express one in terms of the other, say 

$$D_k = \sum_{i=0}^{n} p_k(i) J_i, \quad k = 0, \ldots, n, \tag{11}$$

for some uniquely defined real numbers $p_k(i)$. Equations (8), (9), (11) imply 

$$D_k J_i = p_k(i) J_i. \tag{12}$$

Therefore the $p_k(i)$, $i = 0, \ldots, n$, are the eigenvalues of $D_k$. Also the columns 
of the $J_i$ span the eigenspaces of all the matrices $D_k$. Let rank $J_i = \mu_i$ (say) be 
the dimension of the $i$th eigenspace, i.e. the multiplicity of the eigenvalue $p_k(i)$. 

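Conversely, to express the $J_k$ in terms of the $D_i$, let $P$ be the real 
$(n + 1) \times (n + 1)$ matrix 

$$P = \begin{bmatrix} p_0(0) & p_1(0) & \cdots & p_n(0) \\ p_0(1) & p_1(1) & \cdots & p_n(1) \\ \vdots & \vdots & & \vdots \\ p_0(n) & p_1(n) & \cdots & p_n(n) \end{bmatrix} \tag{13}$$

and let 

$$Q = v P^{-1} = \begin{bmatrix} q_0(0) & q_1(0) & \cdots & q_n(0) \\ q_0(1) & q_1(1) & \cdots & q_n(1) \\ \vdots & \vdots & & \vdots \\ q_0(n) & q_1(n) & \cdots & q_n(n) \end{bmatrix} \ \text{(say)}. \tag{14}$$

We call $P$ and $Q$ the eigenmatrices of the scheme. Then 

$$J_k = \frac{1}{v} \sum_{i=0}^{n} q_k(i) D_i, \quad k = 0, \ldots, n. \tag{15}$$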


Theorem 2. 

$$p_0(i) = q_0(i) = 1, \qquad p_k(0) = v_k, \qquad q_k(0) = \mu_k.$$

Proof. Only the last equation is not immediate. Since $J_k$ is an idempotent the 
diagonal entries of $S^{-1} J_k S$ (see Equation (7)) are 0 and 1. Therefore 

$$\operatorname{trace} S^{-1} J_k S = \operatorname{trace} J_k = \operatorname{rank} J_k = \mu_k.$$

Since trace $D_i = v \delta_{0i}$, (15) implies $\mu_k = q_k(0)$. Q.E.D. 

Problem. (1) Show that $|p_k(i)| \leq v_k$. 


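Theorem 3. The eigenvalues $p_k(i)$ and $q_k(i)$ satisfy the orthogonality conditions: 

$$\sum_{i=0}^{n} \mu_i p_k(i) p_l(i) = v v_k \delta_{kl}, \tag{16}$$
$$\sum_{i=0}^{n} v_i q_k(i) q_l(i) = v \mu_k \delta_{kl}. \tag{17}$$

Also 
$$\mu_j p_i(j) = v_i q_j(i), \quad i, j = 0, \ldots, n. \tag{18}$$

In matrix terminology these are 
$$P^T \Delta_\mu P = v \Delta_v, \tag{19}$$
$$Q^T \Delta_v Q = v \Delta_\mu, \tag{20}$$
where $\Delta_v = \operatorname{diag}\{v_0, v_1, \ldots, v_n\}$, $\Delta_\mu = \operatorname{diag}\{\mu_0, \mu_1, \ldots, \mu_n\}$. 


Proof. The eigenvalues of $D_k D_l$ are $p_k(i) p_l(i)$ with multiplicities $\mu_i$. From (5), 
$$v v_k \delta_{kl} = \operatorname{trace} D_k D_l = \sum_{i=0}^{n} \mu_i p_k(i) p_l(i),$$
which proves (16) and (19). From (19), 
$$Q = v P^{-1} = \Delta_v^{-1} P^T \Delta_\mu,$$
which gives (17), (18) and (20). Q.E.D. 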
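
As a numerical sanity check of Theorem 3, the following sketch borrows the eigenvalues of the binary Hamming scheme from §3 (there $p_k(i) = q_k(i) = P_k(i; n)$ and $v_i = \mu_i = \binom{n}{i}$) and tests (16), (18) and $PQ = vI$ for small $n$. The function names are illustrative choices, not notation from the text.

```python
from math import comb

def krawtchouk(k, x, n):
    """Binary Krawtchouk polynomial P_k(x; n)."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))

def check_orthogonality(n):
    """Test (16), (18) and P*Q = v*I for the Hamming scheme of length n."""
    v = 2 ** n
    val = [comb(n, i) for i in range(n + 1)]        # v_i, and also mu_i in this scheme
    # p[i][k] holds p_k(i), i.e. row i and column k of the eigenmatrix P.
    p = [[krawtchouk(k, i, n) for k in range(n + 1)] for i in range(n + 1)]
    for k in range(n + 1):
        for l in range(n + 1):
            lhs = sum(val[i] * p[i][k] * p[i][l] for i in range(n + 1))
            if lhs != (v * val[k] if k == l else 0):          # relation (16)
                return False
            if val[l] * p[l][k] != val[k] * p[k][l]:          # relation (18) with q = p
                return False
            prod = sum(p[k][i] * p[i][l] for i in range(n + 1))
            if prod != (v if k == l else 0):                  # P*Q = v*I with Q = P
                return False
    return True
```

Calling `check_orthogonality(n)` for a few small values of `n` exercises all three relations with exact integer arithmetic.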


Corollary 4. 

$$p_i(s) p_j(s) = \sum_{k=0}^{n} c_{ijk} p_k(s), \quad s = 0, \ldots, n.$$


Proof. Equate eigenvalues in (5). Q.E.D. 


An isomorphic algebra of $(n + 1) \times (n + 1)$ matrices. We briefly mention that 
there is an algebra of $(n + 1) \times (n + 1)$ matrices which is isomorphic to $\mathcal{A}$, and 
is often easier to work with. Let 

$$L_i = \begin{bmatrix} c_{i00} & c_{i10} & \cdots & c_{in0} \\ c_{i01} & c_{i11} & \cdots & c_{in1} \\ \vdots & \vdots & & \vdots \\ c_{i0n} & c_{i1n} & \cdots & c_{inn} \end{bmatrix}, \quad i = 0, \ldots, n.$$

Then Equation (2) implies 

$$L_i L_j = \sum_{k=0}^{n} c_{ijk} L_k. \tag{21}$$

Thus the $L_i$ multiply in the same manner as the $D_i$. Since $c_{i0k} = \delta_{ik}$, it follows 
that $L_0, \ldots, L_n$ are linearly independent. Therefore the algebra $\mathcal{B}$ consisting 
of all matrices $\sum_{i=0}^{n} a_i L_i$ ($a_i$ real) is an associative commutative algebra, which 
is isomorphic to $\mathcal{A}$ under the mapping $D_i \to L_i$. 

From Corollary 4 

$$P L_k P^{-1} = \operatorname{diag}(p_k(0), \ldots, p_k(n)), \tag{22}$$

which implies that the $p_k(i)$ ($i = 0, \ldots, n$) are the eigenvalues of $L_k$. 
(Alternatively since $\mathcal{A}$ and $\mathcal{B}$ are isomorphic, $D_k$ and $L_k$ have the same 
eigenvalues.) Since $L_k$ is much smaller than $D_k$ it is sometimes easier to find 
the $p_k(i)$ in this way. 
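
Relation (22) can be checked numerically without inverting $P$, by testing $P L_k = \operatorname{diag}(p_k(0), \ldots, p_k(n)) P$. The sketch below does this for the binary Hamming scheme, taking the closed form for $c_{ijk}$ and the Krawtchouk eigenvalues from §3 as given. The layout of $L_k$ (row $a$, column $b$ holds $c_{kba}$) is the one displayed above; the helper names are assumptions made for the illustration.

```python
from math import comb

def c(n, i, j, k):
    """Intersection number c_ijk of the binary Hamming scheme (closed form of Section 3)."""
    if (i + j - k) % 2:
        return 0
    a, b = (i - j + k) // 2, (i + j - k) // 2
    if min(a, b) < 0 or a > k or b > n - k:
        return 0
    return comb(k, a) * comb(n - k, b)

def krawtchouk(k, x, n):
    """Binary Krawtchouk polynomial P_k(x; n), the eigenvalue p_k(x)."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))

def small_matrix(n, k):
    """L_k with the entry in row a, column b equal to c_{k b a}."""
    return [[c(n, k, b, a) for b in range(n + 1)] for a in range(n + 1)]

def check_22(n):
    """Check P L_k = diag(p_k(0), ..., p_k(n)) P, i.e. (22) multiplied through by P."""
    P = [[krawtchouk(k, s, n) for k in range(n + 1)] for s in range(n + 1)]
    for k in range(n + 1):
        L = small_matrix(n, k)
        for s in range(n + 1):
            for b in range(n + 1):
                lhs = sum(P[s][a] * L[a][b] for a in range(n + 1))
                if lhs != P[s][k] * P[s][b]:
                    return False
    return True
```

Each row of $P$ is a left eigenvector of every $L_k$, which is why one $(n+1) \times (n+1)$ computation recovers all the eigenvalues.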

Now it is time for some examples. 


§3. The Hamming association scheme 


The Hamming or hypercubic association scheme is the most important 
example for coding theory. In this scheme $X = F^n$, the set of binary vectors 
of length $n$, and two vectors $x, y \in F^n$ are $i$th associates if they are Hamming 
distance $i$ apart. Clearly conditions (i), (ii) and (iii) of the definition of an 
association scheme are satisfied. Problem 19 of Ch. 1 shows that (iv) holds 
with 

$$c_{ijk} = \begin{cases} \dbinom{k}{(i - j + k)/2} \dbinom{n - k}{(i + j - k)/2} & \text{if } i + j - k \text{ is even}, \\ 0 & \text{if } i + j - k \text{ is odd}. \end{cases}$$

Also $v = |X| = 2^n$ and $v_i = \binom{n}{i}$. The matrices in the Bose-Mesner algebra $\mathcal{A}$ are 
$2^n \times 2^n$ matrices, with rows and columns labeled by vectors $x \in F^n$. In 
particular the $(x, y)$th entry of $D_k$ is 1 if and only if dist $(x, y) = k$. 
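
The closed form for $c_{ijk}$ can be compared with a direct count. The sketch below, with illustrative function names, evaluates the formula and also counts, for every pair $x, y$ at distance $k$, the vectors $z$ at distance $i$ from $x$ and $j$ from $y$; the brute-force routine asserts that the count is the same for every such pair, which is condition (iv).

```python
from itertools import product
from math import comb

def c_formula(n, i, j, k):
    """Closed form for c_ijk in the binary Hamming scheme of length n."""
    if (i + j - k) % 2:
        return 0
    a, b = (i - j + k) // 2, (i + j - k) // 2
    if min(a, b) < 0 or a > k or b > n - k:
        return 0
    return comb(k, a) * comb(n - k, b)

def hamming_distance(u, w):
    return sum(a != b for a, b in zip(u, w))

def c_brute(n, i, j, k):
    """Count z with dist(x, z) = i and dist(y, z) = j over all pairs with dist(x, y) = k."""
    pts = list(product((0, 1), repeat=n))
    counts = {
        sum(1 for z in pts
            if hamming_distance(x, z) == i and hamming_distance(y, z) == j)
        for x in pts for y in pts if hamming_distance(x, y) == k
    }
    assert len(counts) == 1          # condition (iv): the count does not depend on x, y
    return counts.pop()
```

For instance, `c_formula(3, 1, 1, 2)` and `c_brute(3, 1, 1, 2)` both count the two common neighbours of a pair of vertices at distance 2 on the cube.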

For example, let $n = 3$ and label the rows and columns of the matrices by 
000, 001, 010, 100, 011, 101, 110, 111. The graph which has edges labeled 1 is of 
course the cube (Fig. 21.1). 

[Fig. 21.1. The cube, with vertices labeled by the vectors of $F^3$.] 

The diagonals of the faces of the cube are labeled 2, and the four main 
diagonals are labeled 3. The adjacency matrices are $D_0 = I$, 

        01110000          00001110          00000001 
        10001100          00110001          00000010 
        10001010          01010001          00000100 
  D_1 = 10000110 ,  D_2 = 01100001 ,  D_3 = 00001000 . 
        01100001          10000110          00010000 
        01010001          10001010          00100000 
        00110001          10001100          01000000 
        00001110          01110000          10000000 

Note that 
$$D_1^2 = 3I + 2D_2, \qquad D_1 D_2 = 2D_1 + 3D_3, \tag{23}$$
$c_{110} = v_1 = 3$, $c_{111} = 0$, $c_{112} = 2$, $c_{113} = 0$, $c_{220} = v_2 = 3$, $c_{330} = v_3 = 1$, and so on. 
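
The relations in (23) are easy to confirm by machine. The following sketch builds the four adjacency matrices for $n = 3$ in the row and column order used above and multiplies them with plain integer lists; `matmul` and `lincomb` are small helpers written for this illustration.

```python
LABELS = ["000", "001", "010", "100", "011", "101", "110", "111"]

def dist(x, y):
    """Hamming distance between two binary strings of equal length."""
    return sum(a != b for a, b in zip(x, y))

def adjacency(k):
    """D_k: entry 1 where the row and column labels are at distance k."""
    return [[int(dist(x, y) == k) for y in LABELS] for x in LABELS]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lincomb(*terms):
    """Sum of coeff * matrix over the given (coeff, matrix) pairs."""
    size = len(terms[0][1])
    return [[sum(coeff * M[r][s] for coeff, M in terms) for s in range(size)]
            for r in range(size)]

D = [adjacency(k) for k in range(4)]

# The two products in (23).
square_ok = matmul(D[1], D[1]) == lincomb((3, D[0]), (2, D[2]))
product_ok = matmul(D[1], D[2]) == lincomb((2, D[1]), (3, D[3]))
```

Both flags evaluate to true, and the coefficients on the right-hand sides are exactly the intersection numbers $c_{11k}$ and $c_{12k}$.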
Theorem 5. In a Hamming scheme the primitive idempotent $J_k$ ($k = 0, \ldots, n$) is 
the matrix with $(x, y)$th entry 
$$\frac{1}{2^n} \sum_{\operatorname{wt}(z) = k} (-1)^{(x + y) \cdot z}. \tag{24}$$

The eigenvalues are given by 
$$p_k(i) = q_k(i) = P_k(i; n), \tag{25}$$

where $P_k(x; n)$ is the Krawtchouk polynomial defined in Equation (14) of Ch. 
5. 


Proof. Let $A_k$ denote the matrix with $(x, y)$th entry (24). We show 
$$A_k = \frac{1}{2^n} \sum_{i=0}^{n} P_k(i; n) D_i, \tag{26}$$
so $A_k \in \mathcal{A}$. We then show (8)-(10) hold, so the $A_k$ are the primitive idempotents. 
Then (26) implies $q_k(i) = P_k(i; n)$, and from (18) (and Theorem 17 of 
Ch. 5) 
$$p_k(i) = q_i(k) \binom{n}{k} \Big/ \binom{n}{i} = P_i(k; n) \binom{n}{k} \Big/ \binom{n}{i} = P_k(i; n).$$

To prove (26), Problem 14 of Ch. 5 implies 
$$(A_k)_{x,y} = \frac{1}{2^n} P_k(\operatorname{wt}(x + y); n),$$
hence 
$$A_k = \frac{1}{2^n} \sum_{i=0}^{n} P_k(i; n) D_i.$$
The $(x, y)$th entry of $A_k A_l$ is 
$$\frac{1}{2^{2n}} \sum_{u \in F^n} \ \sum_{\operatorname{wt}(v) = k} (-1)^{(x + u) \cdot v} \sum_{\operatorname{wt}(w) = l} (-1)^{(u + y) \cdot w},$$
which simplifies to 
$$\delta_{kl} \frac{1}{2^n} \sum_{\operatorname{wt}(v) = k} (-1)^{(x + y) \cdot v},$$
the $(x, y)$th entry of $\delta_{kl} A_k$, and proves (8), (9). (10) follows easily. Q.E.D. 

Equation (16) now becomes the familiar orthogonality relation for 
Krawtchouk polynomials (Theorem 16 of Ch. 5). 
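
For small $n$ the claims of Theorem 5 can be verified exactly. The sketch below, whose helper names are illustrative, builds the matrix with entries (24) using rational arithmetic, so that the idempotent relations (8)-(10) and the entry formula $2^{-n} P_k(\operatorname{wt}(x + y); n)$ can be tested without rounding. Rows and columns are in the lexicographic order produced by `itertools.product`, which differs from the order used in the $n = 3$ example but does not affect the relations.

```python
from fractions import Fraction
from itertools import product
from math import comb

def krawtchouk(k, x, n):
    """Binary Krawtchouk polynomial P_k(x; n)."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))

def idempotent(k, n):
    """Matrix with (x, y) entry 2^-n * sum over wt(z) = k of (-1)^((x+y).z), as in (24)."""
    pts = list(product((0, 1), repeat=n))
    weight_k = [z for z in pts if sum(z) == k]

    def entry(x, y):
        s = [a ^ b for a, b in zip(x, y)]                       # the vector x + y
        total = sum((-1) ** sum(si & zi for si, zi in zip(s, z)) for z in weight_k)
        return Fraction(total, 2 ** n)

    return [[entry(x, y) for y in pts] for x in pts]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]
```

With `J = [idempotent(k, 3) for k in range(4)]`, the products `matmul(J[k], J[l])` equal `J[k]` when `k == l` and the zero matrix otherwise.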
For example, when $n = 3$, $J_0 = \frac{1}{8} J$, 

              3  1  1  1 -1 -1 -1 -3 
              1  3 -1 -1  1  1 -3 -1 
              1 -1  3 -1  1 -3  1 -1 
  J_1 = (1/8) 1 -1 -1  3 -3  1  1 -1 
             -1  1  1 -3  3 -1 -1  1 
             -1  1 -3  1 -1  3 -1  1 
             -1 -3  1  1 -1 -1  3  1 
             -3 -1 -1 -1  1  1  1  3 

              3 -1 -1 -1 -1 -1 -1  3 
             -1  3 -1 -1 -1 -1  3 -1 
             -1 -1  3 -1 -1  3 -1 -1 
  J_2 = (1/8)-1 -1 -1  3  3 -1 -1 -1 
             -1 -1 -1  3  3 -1 -1 -1 
             -1 -1  3 -1 -1  3 -1 -1 
             -1  3 -1 -1 -1 -1  3 -1 
              3 -1 -1 -1 -1 -1 -1  3 

              1 -1 -1 -1  1  1  1 -1 
             -1  1  1  1 -1 -1 -1  1 
             -1  1  1  1 -1 -1 -1  1 
  J_3 = (1/8)-1  1  1  1 -1 -1 -1  1 
              1 -1 -1 -1  1  1  1 -1 
              1 -1 -1 -1  1  1  1 -1 
              1 -1 -1 -1  1  1  1 -1 
             -1  1  1  1 -1 -1 -1  1 

Then 

  I   =  J_0 + J_1 + J_2 +  J_3, 
  D_1 = 3J_0 + J_1 - J_2 - 3J_3, 
  D_2 = 3J_0 - J_1 - J_2 + 3J_3, 
  D_3 =  J_0 - J_1 + J_2 -  J_3, 

and 

          1  3  3  1 
  P = Q = 1  1 -1 -1 
          1 -1 -1  1 
          1 -3  3 -1 

        0100          0010          0001 
  L_1 = 3020 ,  L_2 = 0203 ,  L_3 = 0010 . 
        0203          3020          0100 
        0010          0100          1000 

(These $L_i$ are written with rows indexed by $j$ and columns by $k$, i.e. the entry 
in row $j$, column $k$ is $c_{ijk}$; this is the transpose of the layout used in §2, and 
the eigenvalues are the same.) 


Problems. (2) Verify that the eigenvalues of $L_i$ are the $(i + 1)$st column of $P$. 
(3) Find $P$ when $n = 4$. 


§4. Metric schemes 


The Hamming scheme and others given below are examples of metric 
schemes, which are defined by graphs. 

Let $\Gamma$ be a connected graph with $v$ vertices, containing no loops or multiple 
edges, and let $X$ be the set of vertices. The distance $\rho(x, y)$ between $x, y \in X$ 
is the number of edges on the shortest path joining them. The maximum 
distance $n$ (say) between any two vertices is called the diameter of $\Gamma$. 


Definition. The graph $\Gamma$ is called metrically regular (or perfectly- or distance-
regular) if, for any $x, y \in X$ with $\rho(x, y) = k$, the number of $z \in X$ such that 
$\rho(x, z) = i$ and $\rho(y, z) = j$ is a constant $c_{ijk}$ independent of the choice of $x$ and 
$y$. 


Clearly we obtain an association scheme with $n$ classes from a metrically 
regular graph of diameter $n$ by calling $x, y \in X$ $i$th associates if $\rho(x, y) = i$. 
Association schemes which can be obtained in this way are called metric 
schemes. To recover the graph from the scheme, define $x$ and $y$ to be 
adjacent iff $(x, y) \in R_1$. 

For example the Hamming scheme is metric, and the graph is the skeleton 
of a unit cube in $n$ dimensions (see Fig. 21.1). 

Metrically regular graphs of diameter two are called strongly regular 
graphs. 


Problems. (4) Show that any association scheme with two classes is metric 
(and is obtained from a strongly regular graph). 
But not all association schemes are metric. 
(5) Show that a scheme is metric iff (i) $c_{1,i,i+1} \neq 0$ and (ii) $c_{1ij} \neq 0 \Rightarrow i - 1 \leq j \leq i + 1$. 
(Hint: (only if) Let $\Gamma$ be the graph defined by $R_1$, and show $\rho(x, y) = i$ iff 
$(x, y) \in R_i$.) 


Thus if an association scheme is metric then for $k = 1, 2, \ldots$ 
$$D_1 D_k = c_{1,k,k+1} D_{k+1} + c_{1kk} D_k + c_{1,k,k-1} D_{k-1}, \tag{27}$$

and hence $D_k$ is a polynomial in $D_1$ of degree $k$. $\mathcal{A}$ is the algebra of 
polynomials in $D_1$, and all the eigenvalues are determined when the eigenvalues 
of $D_1$ (or $L_1$) have been found. In the above example $2L_2 = L_1^2 - 3I$ 
(from (23)), so if $\alpha$ is any eigenvalue of $L_1$, $(\alpha^2 - 3)/2$ is an eigenvalue of $L_2$. In 
general, from (27) 

$$p_1(i) p_k(i) = c_{1,k,k+1} p_{k+1}(i) + c_{1kk} p_k(i) + c_{1,k,k-1} p_{k-1}(i), \tag{28}$$

for $i = 0, \ldots, n$. 
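
To see (28) at work, the sketch below generates the eigenvalues of the binary Hamming scheme from $p_0(i) = 1$ and $p_1(i) = n - 2i$ alone. It assumes the values $c_{1,k,k+1} = k + 1$, $c_{1kk} = 0$ and $c_{1,k,k-1} = n - k + 1$, which come from the closed form for $c_{ijk}$ in §3, and compares the result with the Krawtchouk polynomials of Theorem 5. The function names are illustrative.

```python
from math import comb

def krawtchouk(k, x, n):
    """Binary Krawtchouk polynomial P_k(x; n), used only for comparison."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))

def eigenvalues_by_recurrence(n):
    """Rows i = 0..n of the eigenmatrix P of the Hamming scheme, built from (28)."""
    P = []
    for i in range(n + 1):
        row = [1, n - 2 * i]                 # p_0(i) = 1 and p_1(i) = n - 2i
        for k in range(1, n):
            # (28) with c_{1,k,k+1} = k + 1, c_{1,k,k} = 0, c_{1,k,k-1} = n - k + 1
            num = row[1] * row[k] - (n - k + 1) * row[k - 1]
            assert num % (k + 1) == 0        # the division is exact
            row.append(num // (k + 1))
        P.append(row[: n + 1])
    return P
```

For $n = 3$ this reproduces the matrix $P$ of the example in §3, row by row.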


Problem. (6) Prove that if $D_k$ is a polynomial in $D_1$ of degree $k$ then the 
scheme is metric. 


The most interesting property of a metric scheme is that the eigenvalues 
$p_k(i)$ are obtained from a family of orthogonal polynomials. 


Definition. An association scheme is called a P-polynomial scheme if there 
exist nonnegative real numbers $z_0 = 0, z_1, \ldots, z_n$ and real polynomials 
$\Phi_0(z), \Phi_1(z), \ldots, \Phi_n(z)$, where deg $\Phi_k(z) = k$, such that 
$$p_k(i) = \Phi_k(z_i), \quad i, k = 0, \ldots, n. \tag{29}$$

Theorem 3 implies that the $\Phi_k(z)$ are a family of orthogonal polynomials: 
$$\sum_{i=0}^{n} \mu_i \Phi_k(z_i) \Phi_l(z_i) = v v_k \delta_{kl}. \tag{30}$$

A Q-polynomial scheme is defined similarly. 


For example, a Hamming scheme is both a P- and Q-polynomial scheme 
(with $z_i = i$, $\Phi_k(z) = P_k(z; n)$). 


Theorem 6. (Delsarte.) An association scheme is metric iff it is a P-polynomial 
scheme. 


Proof. If the scheme is metric, then $p_k(i)$ is a polynomial in $p_1(i)$ of degree $k$. 
Set $z_i = v_1 - p_1(i)$. Then there exist polynomials $\Phi_k(z)$ such that $p_k(i) = \Phi_k(z_i)$ 
and deg $\Phi_k(z) = k$. Also $z_0 = 0$ and $z_i \geq 0$ by Problem 1. The $z_i$ are distinct 
because $z_i = z_j$ implies $p_k(i) = p_k(j)$, $k = 0, \ldots, n$, which is impossible since $P$ 
is invertible. Hence the scheme is P-polynomial. 

Conversely, suppose the scheme is P-polynomial, and write $\Phi_1(z) = 
b - (1/a)z$ ($a > 0$). Then $p_1(i) = \Phi_1(z_i)$ and $p_1(0) = v_1$ imply $b = v_1$ and $z_i = 
a(v_1 - p_1(i))$. Since the $\Phi_k(z)$ are orthogonal they satisfy a 3-term recurrence 
(Szegő [1297, p. 42]), say 

$$\alpha_{k+1} \Phi_{k+1}(z) = (\beta_k - z) \Phi_k(z) - \gamma_k \Phi_{k-1}(z),$$

where $\alpha_{k+1} > 0$. (The coefficients must satisfy various conditions that do not 
concern us here.) Evaluating this recurrence at $z_i$ gives 

$$-z_i \Phi_k(z_i) = \alpha_{k+1} \Phi_{k+1}(z_i) - \beta_k \Phi_k(z_i) + \gamma_k \Phi_{k-1}(z_i),$$
$$p_1(i) p_k(i) = c_{1,k,k+1} p_{k+1}(i) + c_{1kk} p_k(i) + c_{1,k,k-1} p_{k-1}(i),$$

for suitably defined $c_{ijk}$. Problem 5 now implies the scheme is metric. Q.E.D. 


Research Problem (21.1). Give a similar characterization for Q-polynomial 
schemes. 


*§5. Symplectic forms 


In this section we construct the association scheme of symplectic forms, 
essential for studying subcodes of the second-order Reed-Muller code (see 
Ch. 15). 

The set $X$ consists of all binary symplectic forms in $m$ variables, or 
equivalently all $m \times m$ symmetric binary matrices with zero diagonal (§2 of 
Ch. 15). Thus $|X| = v = 2^{m(m-1)/2}$. We define $(x, y) \in R_i$ iff the rank of the 
matrix $x + y$ is $2i$, for $i = 0, 1, \ldots, n = [m/2]$. The number of symplectic forms 
of rank $2i$, $v_i$, is given by Theorem 2 of Ch. 15. Let $D_i$ be the adjacency matrix 
of $R_i$. 

We shall prove that this is an association scheme by showing that (3)-(5) 
hold, and at the same time construct the Bose-Mesner algebra $\mathcal{A}$. The 
treatment is parallel to that of the Hamming scheme, but uses Gaussian 
binomial coefficients with $b = 4$ (see Problem 3 of Ch. 15). 

For matrices $x = (x_{ij})$, $y = (y_{ij}) \in X$, define the inner product 

$$\langle x, y \rangle = (-1)^{\sum_{i<j} x_{ij} y_{ij}}. \tag{31}$$


Problem. (7) Show that 
$$\sum_{y \in X} \langle x, y \rangle = v \delta_{0,x}.$$

Theorem 7. If $x$ has rank $2i$ then 
$$\sum_{\operatorname{rank} y = 2k} \langle x, y \rangle = \sum_{j=0}^{k} (-1)^{k-j} b^{\binom{k-j}{2}} \begin{bmatrix} n - j \\ n - k \end{bmatrix} \begin{bmatrix} n - i \\ j \end{bmatrix} c^{j}, \tag{32}$$

where $b = 4$ and $c = 2^m$ ($m$ odd) or $c = 2^{m-1}$ ($m$ even). 
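
For small $m$ the character sums of Theorem 7 can be computed by exhaustive enumeration. The sketch below is an illustration only: it encodes a symplectic form as a bitmask over the pairs $i < j$ (an encoding chosen here for convenience), computes ranks over GF(2) by elimination, and evaluates the right-hand side of (32) with Gaussian binomial coefficients to base $b = 4$. All function names are hypothetical.

```python
from itertools import combinations
from math import comb

def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    rows, rank = list(rows), 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot                      # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def forms(m):
    """All symplectic forms in m variables as (bitmask over pairs i<j, rank)."""
    pairs = list(combinations(range(m), 2))
    out = []
    for x in range(1 << len(pairs)):
        rows = [0] * m
        for t, (i, j) in enumerate(pairs):
            if x >> t & 1:
                rows[i] |= 1 << j
                rows[j] |= 1 << i
        out.append((x, rank_gf2(rows)))
    return out

def inner(x, y):
    """The inner product (31): (-1) to the number of common entries above the diagonal."""
    return -1 if bin(x & y).count("1") % 2 else 1

def gauss(n, k, b=4):
    """Gaussian binomial coefficient [n choose k] to base b (0 outside 0 <= k <= n)."""
    if k < 0 or k > n:
        return 0
    num = den = 1
    for t in range(k):
        num *= b ** (n - t) - 1
        den *= b ** (t + 1) - 1
    return num // den

def p_formula(m, k, i):
    """Right-hand side of (32)."""
    n = m // 2
    c = 2 ** m if m % 2 else 2 ** (m - 1)
    return sum((-1) ** (k - j) * 4 ** comb(k - j, 2) * gauss(n - j, n - k)
               * gauss(n - i, j) * c ** j for j in range(k + 1))

def character_sums(m):
    """Map i -> tuple over k of the sum of <x, y> over rank(y) = 2k, for rank(x) = 2i."""
    fs, n, result = forms(m), m // 2, {}
    for x, rx in fs:
        sums = [0] * (n + 1)
        for y, ry in fs:
            sums[ry // 2] += inner(x, y)
        # The sums should depend only on the rank of x.
        assert result.setdefault(rx // 2, tuple(sums)) == tuple(sums)
    return result
```

For $m = 4$ there are 64 forms, and `character_sums(4)` returns the rows $(1, 35, 28)$, $(1, 3, -4)$, $(1, -5, 4)$, in agreement with `p_formula`.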


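Sketch of Proof. We first show that 
$$\sum_{\operatorname{rank}(y) = 2k} \langle x, y \rangle$$
depends only on the rank of $x$. If $R$ is an invertible $m \times m$ matrix, the map 
$y \to R y R^T$ is a permutation of the set of symplectic matrices which preserves 
rank. Hence 
$$\sum_{\operatorname{rank}(y) = 2k} \langle x, y \rangle = \sum_{\operatorname{rank}(y) = 2k} \langle x, R y R^T \rangle.$$

Also $\langle x, R y R^T \rangle = \langle R^T x R, y \rangle$. By Theorem 4 of Ch. 15 there is an $R$ such that 
$$R^T x R = \begin{bmatrix} G & & & \\ & \ddots & & \\ & & G & \\ & & & 0 \end{bmatrix} = N_i \ \text{(say)}, \tag{33}$$
where 
$$G = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$
and there are $i = \frac{1}{2} \operatorname{rank}(x)$ $G$'s down the diagonal. Then 
$$\sum_{\operatorname{rank}(y) = 2k} \langle x, y \rangle = \sum_{\operatorname{rank}(y) = 2k} \langle N_i, y \rangle = p_k^{(m)}(i) \ \text{(say)},$$
depending only on $i$, $k$ and $m$. 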
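The next step is to prove a recurrence for $p_k^{(m)}(i)$: 

$$p_0^{(m)}(i) = 1, \qquad p_1^{(m)}(0) = \tfrac{1}{3} (2^m - 1)(2^{m-1} - 1), \tag{34}$$
$$p_k^{(m)}(i) = p_k^{(m)}(i - 1) - 2^{2m-2i-1} p_{k-1}^{(m-2)}(i - 1). \tag{35}$$

(34) follows from Theorem 2 of Ch. 15, and then by induction on $i$ 
$$p_1^{(m)}(i) = \tfrac{1}{3} (2^{2m-2i-1} - 3 \cdot 2^{m-1} + 1). \tag{36}$$

Let $x = N_i$, and define $M_i$ to be the matrix obtained from $N_i$ by changing the 
first $G$ to the $2 \times 2$ zero matrix. Then 
$$p_k^{(m)}(i - 1) - p_k^{(m)}(i) = \sum_{\operatorname{rank}(y) = 2k} \big( \langle M_i, y \rangle - \langle N_i, y \rangle \big) = \sum_{\operatorname{rank}(y) = 2k} \langle M_i, y \rangle \big( 1 - (-1)^{y_{12}} \big),$$
where $y_{12}$ is the appropriate entry in $y$, 
$$= 2 \sum_{\substack{\operatorname{rank}(y) = 2k \\ y_{12} = 1}} \langle M_i, y \rangle.$$

Now $\langle M_i, y \rangle = \langle A, D \rangle$ where $A, D$ are of size $(m - 2) \times (m - 2)$, and 
$$y = \begin{bmatrix} G & Z \\ Z^T & D \end{bmatrix}, \qquad Z = \begin{bmatrix} z_{11} & \cdots & z_{1,m-2} \\ z_{21} & \cdots & z_{2,m-2} \end{bmatrix}. \tag{37}$$

If 
$$L = \begin{bmatrix} I & GZ \\ 0 & I \end{bmatrix} \quad \text{then} \quad L^T y L = \begin{bmatrix} G & 0 \\ 0 & F \end{bmatrix},$$
where $F = D + Z^T G Z$ is an $(m - 2) \times (m - 2)$ symplectic matrix of rank $2k - 2$. 
Thus 
$$p_k^{(m)}(i - 1) - p_k^{(m)}(i) = 2 \sum_{\substack{\operatorname{rank}(F) = 2k - 2 \\ \text{all } Z}} \langle A, F + Z^T G Z \rangle.$$

Now $Z^T G Z$ is either 0 or of rank 2. By Problem 8 the number of $Z$ such that 
$Z^T G Z = 0$ is $3 \cdot 2^{m-2} - 2$, and the number of $Z$ such that $Z^T G Z$ is a particular 
matrix of rank 2 is 6. Thus 
$$p_k^{(m)}(i - 1) - p_k^{(m)}(i) = (3 \cdot 2^{m-1} - 4) \sum_{\operatorname{rank}(F) = 2k - 2} \langle A, F \rangle + 12 \sum_{\operatorname{rank}(F) = 2k - 2} \langle A, F \rangle \sum_{\operatorname{rank}(E) = 2} \langle A, E \rangle$$
$$= (3 \cdot 2^{m-1} - 4) p_{k-1}^{(m-2)}(i - 1) + 12 p_{k-1}^{(m-2)}(i - 1) p_1^{(m-2)}(i - 1). \tag{38}$$

Combining (36) and (38) gives the recurrence (35). Finally (32) is the solution 
to (34) and (35); we omit the details. Q.E.D. 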


The RHS of (32) will be denoted by $p_k(i)$, for it will turn out that these are 
the eigenvalues of the scheme. 
Define matrices $J_k$, $k = 0, \ldots, n$, by 

$$(J_k)_{x,y} = \frac{1}{v} \sum_{\operatorname{rank} z = 2k} \langle x + y, z \rangle. \tag{39}$$

An argument similar to that used in the proof of Theorem 5 shows that the $J_k$ 
satisfy (8)-(10) and hence are primitive idempotent matrices, and that 
$$J_k = \frac{1}{v} \sum_{i=0}^{n} p_k(i) D_i, \tag{40}$$

where $p_k(i)$ is given by (32). 


Lemma 8. 
$$v_i p_k(i) = v_k p_i(k).$$


Proof. Consider in two ways the sum 
$$\sum_{\operatorname{rank} x = 2i} \ \sum_{\operatorname{rank} y = 2k} \langle x, y \rangle. \qquad \text{Q.E.D.}$$


Theorem 9. (The orthogonality relation.) 

$$\sum_{i=0}^{n} v_i p_k(i) p_l(i) = v v_k \delta_{kl}.$$


Proof. Consider in two ways the sum 

$$\sum_{x \in X} \ \sum_{\operatorname{rank} y = 2k} \langle x, y \rangle \sum_{\operatorname{rank} z = 2l} \langle x, z \rangle,$$

and use Problem 7. Q.E.D. 


From Lemma 8 and Theorem 9, 
$$\sum_{i=0}^{n} p_i(k) p_l(i) = v \delta_{kl},$$

hence the matrix $P$ with $(i, k)$th entry $p_k(i)$ satisfies $P^2 = vI$, $P^{-1} = (1/v)P$. 
Therefore (40) implies 

$$D_k = \sum_{i=0}^{n} p_k(i) J_i. \tag{41}$$

Let $\mathcal{A}$ be the commutative algebra generated by the $J_k$. From (10) this has 
dimension $n + 1$, and from (40), (41) the $D_k$ are also a basis. Therefore (3)-(5) 
hold, and this is an association scheme. From (40), (41) the eigenvalues 
$p_k(i) = q_k(i)$ of (11), (15) are indeed given by (32), as claimed. 


Problems. (8) Show that (i) the number of $Z$, given by Equation (37), such that 
$Z^T G Z = 0$ is $3 \cdot 2^{m-2} - 2$; and (ii) if rank $(B) = 2$, there are 6 $Z$ such that 
$Z^T G Z = B$. 
(9) Show that $p_k(i)$ is a polynomial of degree $k$ in the variable $b^{-i}$, and 
hence from Theorem 6 that this is a metric association scheme. 


§6. The Johnson scheme 


Our third example is the Johnson or triangular association scheme. Here $X$ 
is the set of all binary vectors of length $l$ and weight $n$, so $v = |X| = \binom{l}{n}$. Two 
vectors $x, y \in X$ are called $i$th associates if dist $(x, y) = 2i$, for $i = 0, 1, \ldots, n$. 


Problem. (10) Show that this is an association scheme with $n$ classes, and find 
the $c_{ijk}$. Show that 

$$v_i = \binom{n}{i} \binom{l - n}{i}.$$


Theorem 10. The eigenvalues are given by 

$$p_k(i) = E_k(i), \qquad q_k(i) = \frac{\mu_k}{v_i} E_i(k), \tag{42}$$
where 
$$\mu_i = \frac{l - 2i + 1}{l - i + 1} \binom{l}{i}, \tag{43}$$

and $E_k(x)$ is an Eberlein polynomial defined by 

$$E_k(x) = \sum_{j=0}^{k} (-1)^j \binom{x}{j} \binom{n - x}{k - j} \binom{l - n - x}{k - j}, \quad k = 0, \ldots, n. \tag{44}$$


For the proof see Yamamoto et al. [1444] and Delsarte [352]. 


Theorem 11. (Properties of Eberlein polynomials.) 

(i) $E_k(x)$ is a polynomial of degree $2k$ in $x$. 

(ii) $E_k(x)$ is a polynomial of degree $k$ in the variable $z = x(l + 1 - x)$. Hence 
this is a P-polynomial scheme. 


(iii) 
ez c» 657) 
(iv) A recurrence: 
$$E_0(x) = 1, \qquad E_1(x) = n(l - n) - x(l + 1 - x),$$
$$(k + 1)^2 E_{k+1}(x) = \{E_1(x) - k(l - 2k)\} E_k(x) - (n - k + 1)(l - n - k + 1) E_{k-1}(x).$$
For the proof see Delsarte [352, 361]. 
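
The recurrence in (iv) can be tested against the defining sum (44). The sketch below evaluates $E_k(x)$ at the integer points $x = 0, \ldots, n$ and checks the recurrence there; it assumes $l \geq 2n$ so that every binomial coefficient has nonnegative arguments, and the function names are illustrative.

```python
from math import comb

def eberlein(k, x, l, n):
    """Eberlein polynomial E_k(x) of (44), for integers 0 <= x <= n and l >= 2n."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) * comb(l - n - x, k - j)
               for j in range(k + 1))

def check_recurrence(l, n):
    """Check E_0, E_1 and the three-term recurrence of (iv) at x = 0, ..., n."""
    for x in range(n + 1):
        E = [eberlein(k, x, l, n) for k in range(n + 1)]
        if E[0] != 1 or E[1] != n * (l - n) - x * (l + 1 - x):
            return False
        for k in range(1, n):
            lhs = (k + 1) ** 2 * E[k + 1]
            rhs = ((E[1] - k * (l - 2 * k)) * E[k]
                   - (n - k + 1) * (l - n - k + 1) * E[k - 1])
            if lhs != rhs:
                return False
    return True
```

At $x = 0$ the sum (44) collapses to $\binom{n}{k} \binom{l - n}{k}$, the valency $v_k$, as Theorem 2 requires of $p_k(0)$.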




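Problem. (11) Let $A_i$ be the $\binom{l}{n} \times \binom{l}{i}$ matrix with rows labeled by the binary 
vectors $x$ of weight $n$ and columns by the binary vectors $\xi$ of weight $i$, with 

$$(A_i)_{x,\xi} = \begin{cases} 1 & \text{if } \xi \subset x, \\ 0 & \text{if not.} \end{cases}$$

Also let $C_i = A_i A_i^T$. Show that 

$$C_i = \sum_{k=0}^{n-i} \binom{n - k}{i} D_k.$$

Hence $C_0, \ldots, C_n$ is a basis for $\mathcal{A}$. 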


§7. Subsets of association schemes 


A code is a subset of a Hamming scheme. In this section we consider a 
nonempty subset $Y$ of an arbitrary association scheme $X$ with relations 
$R_0, R_1, \ldots, R_n$. 

Suppose $|Y| = M > 0$. The inner distribution of $Y$ is the $(n + 1)$-tuple of 
rational numbers $(B_0, B_1, \ldots, B_n)$, where 

$$B_i = \frac{1}{M} |R_i \cap Y^2|$$

is the average number of $z \in Y$ which are $i$th associates of a point $y \in Y$. In 
the Hamming scheme $(B_0, \ldots, B_n)$ is the distance distribution (§1 of Ch. 2) of 
the code $Y$. 

Of course $B_0 = 1$, 
$$B_i \geq 0, \quad i = 1, \ldots, n, \tag{45}$$
and 
$$B_0 + B_1 + \cdots + B_n = M. \tag{46}$$

Delsarte has observed that certain linear combinations of the $B_i$ are also 
nonnegative: 


Theorem 12. (Delsarte.) 

$$B'_k = \frac{1}{M} \sum_{i=0}^{n} q_k(i) B_i \geq 0 \tag{47}$$
for $k = 0, \ldots, n$, where the $q_k(i)$ are the eigenvalues defined in §2. 


Proof. Let $u$ be a vector indicating which elements of $X$ belong to $Y$: 

$$u_x = 1 \ \text{if } x \in Y, \qquad u_x = 0 \ \text{if } x \notin Y.$$
Then 
$$B_i = \frac{1}{M} u D_i u^T,$$
$$B'_k = \frac{1}{M^2} u \Big( \sum_{i=0}^{n} q_k(i) D_i \Big) u^T = \frac{v}{M^2} u J_k u^T, \quad \text{from (15)}. \tag{48}$$

Since $J_k$ is idempotent, its eigenvalues are 0 or 1, and hence $J_k$ is nonnegative 
definite. Therefore $B'_k \geq 0$. Q.E.D. 


For example, applying Theorem 12 to codes we have another proof of 
Theorem 6 of Ch. 5. Applying it to the Johnson scheme we obtain the results 
about constant weight codes used on page 545 of Ch. 17. In general, Equation 
(47) makes it possible to apply linear programming to many problems which 
involve finding the largest subset of an association scheme subject to constraints 
on the $B_i$ and $B'_k$. Such a problem can be stated as: Maximize 
$B_0 + B_1 + \cdots + B_n$ ($= M$), subject to (45), (47) and any additional constraints. 
For examples see pages 538 and 546 of Ch. 17. 
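
As a small illustration of Theorem 12 in the Hamming scheme, the sketch below computes the inner distribution of a binary code and its transform, using the Krawtchouk values $q_k(i) = P_k(i; n)$ from Theorem 5 and the normalization by $1/M$ written in (47). Codewords are represented as strings of 0s and 1s, a choice made here for readability; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def krawtchouk(k, x, n):
    """Binary Krawtchouk polynomial P_k(x; n)."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))

def inner_distribution(code):
    """B_i = (1/M) * number of ordered pairs of codewords at Hamming distance i."""
    n, M = len(code[0]), len(code)
    B = [Fraction(0)] * (n + 1)
    for x in code:
        for y in code:
            B[sum(a != b for a, b in zip(x, y))] += Fraction(1, M)
    return B

def transform(B):
    """B'_k = (1/M) * sum_i P_k(i; n) * B_i, with M = B_0 + ... + B_n."""
    n, M = len(B) - 1, sum(B)
    return [sum(krawtchouk(k, i, n) * B[i] for i in range(n + 1)) / M
            for k in range(n + 1)]
```

For the even weight code `["000", "011", "101", "110"]` the inner distribution is $(1, 0, 3, 0)$ and the transform is $(1, 0, 0, 1)$; every entry is nonnegative, as the theorem asserts.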


*§8. Subsets of symplectic forms 


Let $X$ be the set of symplectic forms in $m$ variables (see §5), and $Y$ a 
subset of $X$ which has the property that if $y, y' \in Y$ then the form $y + y'$ has 
rank at least $2d$. Such a set $Y$ is called an $(m, d)$-set. In this section we derive 
an upper bound on the size of an $(m, d)$-set. 
Let $B_0, B_1, \ldots, B_n$, $n = [m/2]$, be the inner distribution of $Y$. Then 

$$B_1 = \cdots = B_{d-1} = 0.$$


Theorem 13. (Delsarte and Goethals.) For any $(m, d)$-set $Y$, $|Y| \leq c^{n-d+1}$, 
where $c$ was defined in Theorem 7. 


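Proof. It was shown in §5 that for the association scheme of symplectic forms 
$$q_k(i) = \sum_{j=0}^{k} (-1)^{k-j} b^{\binom{k-j}{2}} \begin{bmatrix} n - j \\ n - k \end{bmatrix} \begin{bmatrix} n - i \\ j \end{bmatrix} c^{j}.$$

Using parts (b), (c), (g) of Problem 3 of Ch. 15, this implies 

$$\sum_{k=0}^{j} \begin{bmatrix} n - k \\ n - j \end{bmatrix} q_k(i) = c^{j} \begin{bmatrix} n - i \\ j \end{bmatrix}, \quad j = 0, \ldots, n. \tag{49}$$
Then, since 
$$B'_k = \frac{1}{|Y|} \sum_{i=0}^{n} q_k(i) B_i,$$
we have 
$$\sum_{k=0}^{j} \begin{bmatrix} n - k \\ n - j \end{bmatrix} B'_k = \frac{c^{j}}{|Y|} \sum_{i=0}^{n-j} \begin{bmatrix} n - i \\ j \end{bmatrix} B_i. \tag{50}$$
Taking $j = n - d + 1$ and using $B_1 = \cdots = B_{d-1} = 0$, this becomes 

$$\begin{bmatrix} n \\ d - 1 \end{bmatrix} B'_0 + \sum_{k=1}^{n-d+1} \begin{bmatrix} n - k \\ d - 1 \end{bmatrix} B'_k = \frac{c^{n-d+1}}{|Y|} \begin{bmatrix} n \\ d - 1 \end{bmatrix}.$$

Also, from Theorem 2, $B'_0 = 1$. 
Therefore 

$$\sum_{k=1}^{n-d+1} \begin{bmatrix} n - k \\ d - 1 \end{bmatrix} B'_k = \begin{bmatrix} n \\ d - 1 \end{bmatrix} \Big( \frac{c^{n-d+1}}{|Y|} - 1 \Big). \tag{51}$$

By Theorem 12 $B'_k \geq 0$, hence $|Y| \leq c^{n-d+1}$. Q.E.D. 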


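This theorem shows that the sets of symplectic forms described in Ch. 15 
are indeed maximal sets. 


Theorem 14. If an $(m, d)$-set $Y$ is such that $|Y| = c^{n-d+1}$, then the inner 
distribution of $Y$ is given by 
$$B_{n-i} = \sum_{j=i}^{n-d} (-1)^{j-i} b^{\binom{j-i}{2}} \begin{bmatrix} j \\ i \end{bmatrix} \begin{bmatrix} n \\ j \end{bmatrix} (c^{n-d+1-j} - 1) \tag{52}$$
for $i = 0, 1, \ldots, n - d$. 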


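Proof. If $Y$ attains the bound every term on the LHS of (51) is zero, i.e. 

$$|Y| = c^{n-d+1}, \qquad B'_k = 0, \quad k = 1, \ldots, n - d + 1.$$

Now for $j = 0, \ldots, n$ 
$$\sum_{k=0}^{j} \begin{bmatrix} n - k \\ n - j \end{bmatrix} B'_k = \frac{c^{j}}{|Y|} \sum_{i=0}^{n-j} \begin{bmatrix} n - i \\ j \end{bmatrix} B_i.$$
For $j \leq n - d$ all terms on the left except the first vanish, and using (49) this 
becomes 
$$\begin{bmatrix} n \\ j \end{bmatrix} c^{n-d+1} = c^{j} \sum_{i=0}^{n-j} \begin{bmatrix} n - i \\ j \end{bmatrix} B_i, \quad j = 0, \ldots, n - d.$$

But $B_1 = \cdots = B_{d-1} = 0$, so 

$$\sum_{s=0}^{n-d} \begin{bmatrix} s \\ j \end{bmatrix} B_{n-s} = \begin{bmatrix} n \\ j \end{bmatrix} (c^{n-d+1-j} - 1).$$

The stated result now follows from Problem 3g of Ch. 15. Q.E.D. 


This theorem gives the distance distributions of the codes constructed from 
maximal $(m, d)$-sets in Ch. 15. 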


Problem. (12) (Kasami, [727].) Use Theorem 14 to show that if $m \geq 5$ is odd 
then the dual of the triple-error-correcting BCH code has parameters 

$$[2^m - 1, \ 3m, \ 2^{m-1} - 2^{(m+1)/2}]$$
and weight distribution 

  i                              A_i 
  0                              1 
  2^{m-1} ∓ 2^{(m+1)/2}          (1/3) 2^{(m-5)/2} (2^{(m-3)/2} ± 1)(2^m - 1)(2^{m-1} - 1) 
  2^{m-1} ∓ 2^{(m-1)/2}          (1/3) 2^{(m-3)/2} (2^{(m-1)/2} ± 1)(2^m - 1)(5·2^{m-1} + 4) 
  2^{m-1}                        (2^m - 1)(9·2^{2m-4} + 3·2^{m-3} + 1). 


Remark. For even $m$ this method doesn't work. However in this case the 
weight distribution has been found by Berlekamp ([113, Table 16.5], [114] and 
[118]). The dual of the extended triple-error-correcting BCH code has 
parameters 

$$[N = 2^m, \ 3m + 1, \ 2^{m-1} - 2^{(m+2)/2}] \quad \text{for even } m \geq 6,$$
and weight distribution 

  i                              A_i 
  0, 2^m                         1 
  2^{m-1} ± 2^{(m+2)/2}          N(N - 1)(N - 4)/960 
  2^{m-1} ± 2^{m/2}              7N^2(N - 1)/48 
  2^{m-1} ± 2^{(m-2)/2}          2N(N - 1)(3N + 8)/15 
  2^{m-1}                        (N - 1)(29N^2 - 4N + 64)/32. 


Problem. (13) Apply construction Y1 of §9.1 of Ch. 18 with $\mathcal{C}$ equal to a 
$[2^{m+1}, 2^{m+1} - 3m - 4, 8]$ extended triple-error-correcting BCH code, and obtain 
codes with distance 7, redundancy $3m + 2$, and length $2^m + 2^{(m+2)/2} - 1$ if $m$ is 
even, or length $2^m + 2^{(m+3)/2} - 1$ if $m$ is odd, for all $m \geq 3$. 
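
A quick consistency check on the two weight distributions given in Problem 12 and in the Remark: the weights must all be distinct and the counts must add up to the number of codewords, $2^{3m}$ for odd $m$ and $2^{3m+1}$ for even $m$. The sketch below evaluates both tables with exact integer arithmetic; the function names are chosen for this illustration and are not taken from the text.

```python
def kasami_odd(m):
    """Weight distribution of Problem 12 (m odd, m >= 5) as {weight: count}."""
    assert m % 2 == 1 and m >= 5
    h, q = (m - 1) // 2, 2 ** m
    dist = {0: 1}
    for sign in (+1, -1):
        # weight 2^(m-1) -/+ 2^((m+1)/2), count with the matching +/- sign
        dist[q // 2 - sign * 2 ** (h + 1)] = (
            2 ** (h - 2) * (2 ** (h - 1) + sign) * (q - 1) * (q // 2 - 1) // 3)
        # weight 2^(m-1) -/+ 2^((m-1)/2)
        dist[q // 2 - sign * 2 ** h] = (
            2 ** (h - 1) * (2 ** h + sign) * (q - 1) * (5 * q // 2 + 4) // 3)
    dist[q // 2] = (q - 1) * (9 * 2 ** (2 * m - 4) + 3 * 2 ** (m - 3) + 1)
    return dist

def berlekamp_even(m):
    """Weight distribution from the Remark (m even, m >= 6) as {weight: count}."""
    assert m % 2 == 0 and m >= 6
    N = 2 ** m
    half = N // 2
    dist = {0: 1, N: 1}
    table = [
        (2 ** ((m + 2) // 2), N * (N - 1) * (N - 4) // 960),
        (2 ** (m // 2), 7 * N * N * (N - 1) // 48),
        (2 ** ((m - 2) // 2), 2 * N * (N - 1) * (3 * N + 8) // 15),
    ]
    for delta, count in table:
        dist[half - delta] = dist[half + delta] = count
    dist[half] = (N - 1) * (29 * N * N - 4 * N + 64) // 32
    return dist
```

For $m = 5$ the first function gives the weights 8, 12, 16, 20, 24 with counts 465, 8680, 18259, 5208, 155, which together with the zero word total $2^{15}$.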


§9. t-Designs and orthogonal arrays 


A subset $Y$ of $X$ is called a $t$-design if $B'_1 = \cdots = B'_t = 0$. The justification 
for this definition comes from: 


Theorem 15. A subset $Y$ in the Johnson scheme is a $t$-design iff the vectors of 
$Y$ form an ordinary $t$-$(l, n, \lambda)$ design for some $\lambda$. 


Proof. Note from (48) that in any association scheme $Y$ is a $t$-design iff 
$$u J_k u^T = 0 \quad \text{for } k = 1, \ldots, t, \tag{53}$$

where $u$ is the indicator vector of $Y$ as defined in the proof of Theorem 12. 
Suppose that the vectors of $Y$ form a $t$-$(l, n, \lambda)$ design. For $1 \leq i \leq t$ let $A_i$ 
and $C_i$ be as in Problem 11. Then $A_i^T u^T$ is a column vector in which the $\xi$th 
entry is the number of vectors in $Y$ which contain the vector $\xi$ of weight $i$. 
Since the vectors of $Y$ also form an $i$-$(l, n, \lambda_i)$ design, with $\lambda_i = |Y| \binom{n}{i} / \binom{l}{i}$, 

$$A_i^T u^T = \lambda_i \mathbf{1} = \frac{|Y|}{v} A_i^T \mathbf{1},$$

where $\mathbf{1}$ is the all-ones column vector and $v = \binom{l}{n}$. Since $C_i = A_i A_i^T$, 
$$C_i u^T = \frac{|Y|}{v} C_i \mathbf{1}. \tag{54}$$

As the $C_i$ are a basis for $\mathcal{A}$, (54) holds for all the matrices in $\mathcal{A}$, and in 
particular: 

$$J_k u^T = \frac{|Y|}{v} J_k \mathbf{1}.$$

But 
$$J_0 = \frac{1}{v} J \quad \text{and} \quad J_k J_0 = 0 \ (k > 0);$$
hence 
$$J_k u^T = 0 \quad \text{for } 1 \leq k \leq t,$$
$$u J_k u^T = 0 \quad \text{for } 1 \leq k \leq t,$$

and $Y$ is a $t$-design in the association scheme by (53). 

Conversely, if $Y$ is a $t$-design in the association scheme, $u J_k u^T = 0$, hence 
$J_k u^T = 0$ since $J_k$ is nonnegative definite. Reversing the above proof shows 
that $Y$ is an ordinary $t$-design. Q.E.D. 


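For example, let $Y$ consist of the 14 codewords of weight 4 in the [8, 4, 4] 
Hamming code. The inner distribution and its transform are 

$$B_0 = 1, \quad B_2 = 12, \quad B_4 = 1,$$
$$B'_1 = B'_2 = B'_3 = 0, \quad B'_4 = 56$$

(the values of $B'_k$ quoted here are the sums $\sum_i q_k(i) B_i$, without dividing by $M = 14$), 
and indeed these codewords form a 3-(8, 4, 1) design. 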
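
The numbers in this example can be reproduced directly. The sketch below generates the [8, 4, 4] code from an assumed generator matrix (the first-order Reed-Muller code of length 8, one standard form of the extended Hamming code), extracts the 14 words of weight 4, and evaluates the sums $\sum_i q_k(i) B_i$ in the Johnson scheme with $l = 8$, $n = 4$, using (42)-(44). All identifiers are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb

L_LEN, WT = 8, 4        # Johnson scheme on words of length l = 8 and weight n = 4
GEN = ["11111111", "00001111", "00110011", "01010101"]   # assumed generator rows

def weight4_words():
    """The codewords of weight 4, as integers, in the code spanned by GEN."""
    words = set()
    for coeffs in product((0, 1), repeat=len(GEN)):
        w = 0
        for bit, g in zip(coeffs, GEN):
            if bit:
                w ^= int(g, 2)
        words.add(w)
    return sorted(w for w in words if bin(w).count("1") == WT)

def eberlein(k, x, l=L_LEN, n=WT):
    """Eberlein polynomial E_k(x) of (44)."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) * comb(l - n - x, k - j)
               for j in range(k + 1))

def inner_distribution(Y):
    """B_i = (1/|Y|) * number of ordered pairs of words of Y at distance 2i."""
    B = [Fraction(0)] * (WT + 1)
    for y in Y:
        for z in Y:
            B[bin(y ^ z).count("1") // 2] += Fraction(1, len(Y))
    return B

def q(k, i):
    """q_k(i) = mu_k * E_i(k) / v_i, from (42) and (43)."""
    mu_k = Fraction(L_LEN - 2 * k + 1, L_LEN - k + 1) * comb(L_LEN, k)
    v_i = comb(WT, i) * comb(L_LEN - WT, i)
    return mu_k * eberlein(i, k) / v_i

def transform(B):
    """The sums over i of q_k(i) * B_i for k = 0..n (no division by |Y|)."""
    return [sum(q(k, i) * B[i] for i in range(WT + 1)) for k in range(WT + 1)]
```

With `Y = weight4_words()`, `inner_distribution(Y)` is $(1, 0, 12, 0, 1)$ and `transform` of it is $(14, 0, 0, 0, 56)$.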


Theorem 16. $Y$ is a $t$-design in the Hamming scheme iff the vectors of $Y$ form 
an orthogonal array of size $|Y|$, $n$ constraints, 2 levels, strength $t$ and index 
$|Y| / 2^t$. 


Proof. See Theorem 8 of Ch. 5 and §8 of Ch. 11. Q.E.D. 


Notes on Chapter 21 


§1. Association schemes were introduced by Bose and Shimamoto [186] and 
the algebra $\mathcal{A}$ by James [688] and Bose and Mesner [183]. For further 
properties see Yang [103, 104], Blackwelder [154], Bose [178], Yamamoto et 
al. [1422-1444], Wan [1458] and Wan and Yang [1459]. 


The group case. An important class of association schemes arises from 
permutation groups. Let $\mathcal{G}$ be a transitive permutation group on a set $X$ 
containing $v$ points. $X$ is called a homogeneous space of $\mathcal{G}$. Let $\mathcal{G}_x$ be the 
subgroup of permutations fixing $x \in X$, and let $S_0 = \{x\}, S_1, \ldots, S_n$ be the 
orbits in $X$ under $\mathcal{G}_x$. Then $\mathcal{G}$ is said to be of rank $n + 1$. 

Define the action of $\mathcal{G}$ on $X \times X$ by $g(x, y) = (g(x), g(y))$, $g \in \mathcal{G}$, $x, y \in X$. 

Then $X \times X$ is partitioned into the orbits $R_0 = \{(x, x) : x \in X\}, R_1, \ldots, R_n$ 
under $\mathcal{G}$, and there is a 1-1-correspondence between the $R_i$ and the $S_i$. 
$(X, R_0, \ldots, R_n)$ is a homogeneous configuration in Higman's terminology. The 
$R_i$ are relations on $X$ which however need not be symmetric. If they are 
symmetric the configuration is called coherent, and forms an association 
scheme with $n$ classes. The algebra $\mathcal{A}$ can be defined even if the $R_i$ are not 
symmetric. $\mathcal{A}$ is the set of all complex $v \times v$ matrices which commute with all 
the permutation matrices representing the elements of $\mathcal{G}$ and is called the 
centralizer ring of $\mathcal{G}$. Then $\mathcal{A}$ is commutative iff the $R_i$ are symmetric, in 
which case $\mathcal{A}$ coincides with the Bose-Mesner algebra defined in §2. Higman 
[652] has proved 


Theorem 17. $\mathcal{A}$ is commutative if $n < 5$. 


For much more about this subject see for example Cameron [232], Hes- 
tenes [644], Hestenes and Higman [645], Higman [646-653], Koornwinder 
[776] and Wielandt [1415]. 

For example, the Hamming, Johnson and symplectic schemes are coherent 
configurations corresponding to the following groups: (i) the group of the 
$n$-cube, of order $2^n n!$, (ii) the symmetric group $\mathcal{S}_l$ and (iii) the group of all 
$m \times m$ invertible matrices (cf. Theorem 4 of Ch. 15). 

A combinatorial problem closely related to association schemes is that of 
finding the largest number of lines in $n$-dimensional Euclidean space having a 
given number of angles between them - see Cameron et al. [233], Delsarte et 
al. [367], Hale and Shult [578], Lemmens and Seidel [812] and Van Lint and 
Seidel [856]. 


§4. For metrically regular graphs see Biggs [144-147], Damerell [327], Doob 
[379], Higman [649] and Smith [1240-1242]. Strongly regular graphs are 
described by Berlekamp et al. [128], Bose [175], Bussemaker and Seidel [221], 
Chakravarti et al. [260,261], Delsarte [349], Goethals and Seidel [502, 503], 
Higman [646], Hubaut [672] and Seidel [1175-1177]. 


§§5, 8 are based on Delsarte [364], which also gives the theory of symplectic 
forms over GF(q) (in which case B is a skew-symmetric matrix). The association 
scheme of bilinear forms over GF(q) is described by Delsarte in [355]. 


§6. The Eberlein polynomials are defined in [401]. They are closely related to the 
dual Hahn polynomials - see Hahn [574] and Karlin and McGregor [719]. The 
association scheme of subspaces of a vector space is closely related to the 
Johnson scheme - see Delsarte [357, 361]. 

A preliminary version of this chapter appeared in [1229]. 


Appendix A 


Tables of the best 
codes known 


§1. Introduction 


This Appendix contains three tables of the best codes known (to us). 
Figure 1 is a table of upper and lower bounds on $A(n, d)$, the number of 
codewords in the largest possible (linear or nonlinear) binary code of length $n$ 
and minimum distance $d$, for $n \leq 23$ and $d \leq 9$. Figure 2 is a more extensive 
table of the best codes known, covering the range $n \leq 512$, $d \leq 29$. Thus Fig. 2 
gives lower bounds on $A(n, d)$. Figure 3 is a table of $A(n, d, w)$, the number of 
codewords in the largest possible binary code of length $n$, distance $d$, and 
constant weight $w$, for $n \leq 24$ and $d \leq 10$. 

The purpose of these tables is to serve as a reference list of very good 
codes, as bench-marks for judging new codes, and as illustrations of the 
constructions given throughout the book. We would greatly appreciate hearing 
of improvements (send them to N. J. A. Sloane, Math. Research Center, 
Bell Labs, Murray Hill, New Jersey, 07974). Many of the lower bounds in Fig. 
3 are extremely weak (or nonexistent). 

General comments. 

(1) It is enough to give tables for odd values of $d$ only (or even values of $d$ 
only) in view of Theorem 10a and Theorem 1a of Ch. 17. 

(2) A code given without reference is often obtained by adding zeros to a 
shorter code. (Thus $A(n + 1, d) \geq A(n, d)$ and $A(n + 1, d, w) \geq A(n, d, w)$.) 

(3) Some other tables worth mentioning are: Berlekamp's table [113] of 
selected linear codes of length $\leq 100$ and their weight distributions; Chen's 
table [266] of all cyclic codes of length $\leq 65$; Delsarte et al.'s table [368] of 
upper bounds on $A(n, d, w)$; Helgert and Stinaff's table [636] of upper and 
16 
20° 


38°-40 
72°-80 12 
144°-160 24 
256 32° 
512 64 
1024 128 
2048° 256 
2560°-3276 256-340 
5120°-6552 512-680 
9728°-13104 1024-1288 
19456-26208 2048*-2372 
] 2560*—4096 
4096—6942 
8192-13774 
16384*-24106 128'-280 





Fig. 1. Values of A(n,d). 


Key to Fig. 1. 

a  See p. 541 of Ch. 17. 
b  Constructed in §9 of Ch. 2. 
d  A Hamming code, §7 of Ch. 1. 
e  See p. 538 of Ch. 17. 
f  The Nordstrom-Robinson code (§8 of Ch. 2). 
g  See §7.3 of Ch. 18. 
h  A conference matrix code (§4 of Ch. 2). 
i  From Best et al. [140]. 
j  The Golay code (§6 of Ch. 2). 
k  Constructed by W. O. Alltop [26]. 
l  From a Hadamard code (§3 of Ch. 2). 


lower bounds on linear codes of length ≤ 127; Johnson's tables [695], [696] of 
upper bounds on A(n, d) and A(n, d, w); and McEliece et al.'s table [946] of 
upper bounds on A(n,d). Also Peterson and Weldon [1040] contains a 
number of useful tables. 


Research Problem (A1). An extended version of Berlekamp’s table would be 
useful, giving weight distributions of a number of the best linear and distance- 
invariant nonlinear codes of length up to 512. 
FIGURE 2 
THE BEST CODES KNOWN OF LENGTH UP TO 512 
AND MINIMUM DISTANCE UP TO 29. 


FOR EACH LENGTH N AND MINIMUM DISTANCE D, 
THE TABLE GIVES THE SMALLEST REDUNDANCY 
R = LENGTH - LOG2 (NUMBER OF CODEWORDS) 
OF ANY KNOWN BINARY CODE. 


REMARKS. LAST REVISED AUGUST 12, 1976. THE CODES NEED 
NOT BE LINEAR. AN EARLIER VERSION OF THIS TABLE 
APPEARED IN [1225]. 


DISTANCE D = 3 
(SEE SECT. 9 OF CH 2 AND SECT. 7.3 OF CH 18) 

      N    R       TYPE  REF 
  4-  7    3       HG    [592] 
      8    3.678   SW    [509] 
      9    3.752   SW    [509] 
 10- 11    3.830   SW    [701] 
 12- 15    4       HG    [592] 
 16- 17    4.678   SW    [1239] 
 18- 19    4.752   SW    [1239] 
 20- 23    4.830   SW    [1239] 
 24- 31    5       HG    [592] 
 32- 35    5.678   SW    [1239] 
 36- 39    5.752   SW    [1239] 
 40- 47    5.830   SW    [1239] 
 48- 63    6       HG    [592] 
 64- 71    6.678   SW    [1239] 
 72- 79    6.752   SW    [1239] 
 80- 95    6.830   SW    [1239] 
 96-127    7       HG    [592] 
128-143    7.678   SW    [1239] 
144-159    7.752   SW    [1239] 
160-191    7.830   SW    [1239] 
192-255    8       HG    [592] 
256-287    8.678   SW    [1239] 
288-319    8.752   SW    [1239] 
320-383    8.830   SW    [1239] 
384-511    9       HG    [592] 
    512    9.678   SW    [1239] 


DISTANCE D = 5 
(SEE SECT. 7.3 OF CH 18) 

      N    R       TYPE  REF 
  7-  8    6       LI    [308] 
  9- 11    6.415   HD    [819] 
 12- 15    7       PR    [1002] 
 16- 19    8       XP    [1237] 
     20    8.678   X4    [1237] 
 21- 23    9       LI    [1378] 
 24- 32   10       B     [518] 
 33- 63   11       PR    [1081] 
 64- 70   12       X4    [1237] 
 71- 74   13       AL    [633] 
 75-128   14       B     [1237] 
129-255   15       PR    [1081] 
256-271   16       X4    [1237] 
272-278   17       GB    [286] 
279-512   18       B     [1237] 





Tables of the best codes known 


[ 308] 


[ 1239] 
[ 506] 


[715] 
[715] 
( 1094 ] 
[495] 


( 1237] 
[625] 
( 1470] 
[536] 
[495] 


( 1291] 
[631] 
[536] 


REF 
[ 308] 


[ 819] 
[ 819} 


[819] 
( 26] 
[625] 
(1087 ) 
( 1237] 


(113] 


(625] 
( 1237] 
[631] 
[741] 


(721) 
[633] 
( 1237] 
| 1225] 
[721] 
[633] 
( 1237] 
[1291] 
| 1291) 
[631] 


DISTANCE D = 7 
N R TYPE REF 
10- 11 9 LI 
12- 15 10 RM 
16 10.830 CM 
17- 23 11 So 
24 12 
25- 27 13 DC 
28- 30 14 DC 
31- 35 15 LI 
36- 63 16 IM 
64 17 
65- 67 18 XC 
68- 72 19 LI 
73- 87 20 ZV 
88-128 21 GP 
129-255 22 IM 
256-257 23- 24 
258-265 25 GP 
266-311 26 SV 
312-512 27 GP 
DISTANCE D = 9 
N R TY PE 
13- 14 12 LI 
15 13 
16 13.415 HD 
17- 19 13.678 HD 
20 14.678 
21 15.415 HD 
22- 23 16 LI 
24- 26 17 LI 
27- 30 18 PT 
31- 35 18.415 Y2 
36 19.415 
37- 41 20 OR 
42- 45 21 QR 
46- 47 22 LI 
48~ 49 22.193 Y2 
50- 52 23 SV 
53- 73 24 B 
74 25 
75- 76 26 XQ 
77- 91 27 AL 
92-128 28 B 
129-135 29 XB 
136-142 30 XQ 
143-167 31 AL 
168-256 32 B 
257-265 33 G 
266-274 34 GP 
275-311 35 SV 
312-512 36 B 


( 1237] 
DISTANCE D 


N 

16- 17 
18 

19 

20 

21- 23 
24 

25- 26 
27- 31 
32 

33- 35 
36- 47 
48- 50 
51- 63 
64- 67 
68 

69- 71 
72 

73- 77 
78- 89 
90- 96 
97-128 
129-135 
136-142 
143-149 
150-191 
192-256 
257-264 
265-272 
273-280 
281-311 
312-512 


R 


= 11 


TYPE 


LI 


REF 
[ 308] 


(819] 
[518] 
[819] 
( 1238] 
( 636] 


( 1237] 


[1225] 
[ 1470) 


( 1291] 
[ 1083A] 
( 286] 
( 536] 
( 1225] 
( 721] 
( 1291] 
[6317] 
(536] 
( 1225] 
[721] 
( 1291] 
(631] 
( 536] 
DISTANCE D 
N R 
19- 20 18 
21- 22 19- 20 
23 20.415 
24 21 
25- 27 21.193 
28 22.093 
29 23 
30- 31 24 
32- 34 25 
35- 37 26 
38 27 
39- 43 28 
44- 45 29- 30 
46- 55 31 
56 32 
57- 63 33 
64- 70 34 
71- 73 35 
74 36 
75- 77 37 
78- 79 38 
80- 90 39 
91 40 
92- 99 41 
100-128 42 
129-135 43 
136-142 44 
143-149 45 
150-156 46 
157-191 47 
192-256 48 
257-264 49 
265-272 50 
273-280 51 
281-288 52 
289-327 53 
328-512 54 




TYPE 


REF 
[ 308] 


[819] 
[102] 
(819] 
( 1238] 


[ 625A] 
[ 1047 ] 
[1225] 


[ 113] 


( 1237] 


( 1225] 
(721] 


(715] 
[722] 
( 1083A] 


( 286] 
( 1237] 
( 1225] 
( 721] 
( 1291] 
( 1291] 
(637] 
( 1237] 
[ 1225] 
( 721] 
( 1291] 
{ 1291] 
(637] 
( 1237] 


DISTANCE D 
N R 
22- 23 21 

24- 25 22- 23 

26 23.415 
27 24 

28 24.678 
29- 31 25 
32 26 

33 26.830 

34 27.752 
35 28 
36- 39 29 
u0- 41 30 
42- 44 31 
45- 47 32 
48- 50 33 
51- 55 34 
56- 63 35 
64- 72 36 

73- 74 37- 38 
75- 79 39 

80- 81 40- 41 
82- 87 42 

88- 90 43- 45 
91- 92 46 

93- 95 46.678 
96- 99 47 

100-101 47.678 
102-104 48 
105-128 49 
129-135 50 
136-142 51 
143-149 52 
150-156 53 
157-163 54 
164-191 55 
192-256 56 
257-264 57 
265-272 58 
273-280 59 
281-288 60 
289-296 61 
297-327 62 
328-512 63 


cv 


cY 


KS 
3B 
GP 
XB 
XQ 
GP 
GP 
GP 
Y1 
GP 
XB 
XQ 
GP 
GP 
GP 
Y1 
GP 




REF 
[ 308] 


[819] 
[ 102] 
( 819] 


(819] 
( 1238] 
( 1225] 
( 1291] 
( 1237] 
( 859A] 
( 1237] 
( 113] 

[113] 

[744] 

( 1083A] 


[715] 
[715] 


( 1083A] 
[722] 
[715] 
L722] 
( 286] 
[536] 
( 1225] 
[721] 
( 1291] 
( 1291] 
( 1291] 
( 637] 
(536] 
( 1225] 
( 721] 
( 1291] 
[1291] 
( 1291] 
[ 637] 
[536] 




N 


25- 
27- 


33- 


40- 


43- 
45- 
47- 


51- 
54- 


57- 
63- 
67- 
72- 
90- 
94- 


103- 
106- 
108- 


127- 
129- 
136- 
143- 
150- 
157- 
164- 
171- 
192- 
257- 
265- 
273- 
281- 
289- 
297- 
305- 
332- 


Tables of the best codes known Appendix A. §1. 
DISTANCE D = 17 DISTANCE D = 19 
R TYPE REF N R TYPE REF 
26 24 LI [308] 28- 29 27 LI [308] 
28 25- 26 30- 32 28- 30 
29 26.415 HD [819] 33 30.415 HD [819] 
30 27.415 34 31 CY [266] 
31 28 BV [102] 35 31.678 HD [819] 
32 28.415 HD [819] 36 32.415 HD {819] 
35 28.830 HD (| 819) 37- 39 32.678 HD [819] 
36 29.752 CM { 1238] 40 33.608 CM [1238] 
37 30.678 HD [819] 41 34.541 HD [819] 
38 31.608 CM [1238] 42 35.541 
39 32.541 HD [819] 43- 44 36 AX [26] 
41 33 SP [26] 45 37 
42 34 46- 48 38 HS [636] 
44 35 SP [1291] 49- 51 39 LI [717] 
46 36 XQ [721] 52 40 
49 37 LI [717] 53- 55 41 Y2 (1237) 
50 38 56- 59 42 DG [ 364) 
53 39 LI [715] 60- 61 43 B 
55 39.142 Y3 [1237] 62- 63 44 CY (266] 
56 40 Y1 [1237] 64- 72 45 CY [1083A] 
62 41 CY 1,266] 73- 76 46- 49 
66 42 XC [ 1237] 77- 83 50 Y1 [1237] 
71 43 Y1 [1237] 80-103 51 QR [715] 
89 uu QR [715] 104 52 
93 45- 48 105-107 53 DC (715) 
101 49 QR [715] 108-109 54- 55 
102 50 110-127 56 B 
105 51 DC [715] 128-129 57- 58 
107 52- 53 130-131 59 GP [1291] 
125 54 B 132-135 60 G ( 1291] 
126 55 136-139 61 XC { 1237] 
128 56 B [1237] 140-141 62- 63 
135 57 XB [1225] 142-143 64 NL [1291] 
142 58 XQ (721] 144-147 65 NL [1291] 
149 59 SP [1291] 148-151 66 GP [1291] 
156 60 GP [1291] 152-191 67 Y1 [637] 
163 61 GP [( 1291] 192-255 68 B 
170 62 GP ( 1291] 256-260 69 XB [1225] 
191 63 Y1 [637] 261-263 70- 72 
256 64 B [1237] 264-271 73 GP [1291] 
260 65 XB [1225] 272-273 74 GP [1291] 
272 66 xQ [721] 274-280 75 GP [1291] 
280 67 SP [1291] 281-288 76 GP [1291] 
288 68 SP [1291] 289-296 77 SP (1291) 
296 69 GP [1291] 297-304 78 SP [1291] 
304 70 GP [1291] 305-312 79 SP [1291] 
331 71 Y1 (637] 313-339 80 v1 [637] 


512 72 B [1237] 340-512 81 GP [536] 
DISTANCE D 
N R 
31- 32 30 

33- 35 31- 33 

36 33.415 

37 34.415 
38 35 

39 35.678 

40 36.193 

41- 43 36.541 

44 37.541 

45 38.415 
46- 47 39 

48- 50 40- 42 
51- 53 43 

54- 57 43.415 
58- 61 44 
62- 63 45 

64- 67 46- 49 
68- 69 50 
70 51 
71- 72 52 
73- 71 53 
78 54 
79- 86 55 

87- 90 56- 59 
91- 95 60 
96 61 

97-101 61.678 
102-105 62 
106-127 63 
128-135 64 

136-138 65- 67 
139-142 68 
143-146 69 

147-148 70- 71 
149-150 72 
151-154 73 
155-158 74 
159-191 75 
192-255 76 
256-264 77 
265-268 78 

269-270 79- 80 
271-272 81 
273-279 82 
280-281 83 
282-288 84 
289-296 85 
297-304 86 
305-312 87 
313-320 88 
321-339 89 
340-512 90 




= 21 
TYPE REF 
LI [308] 
HD [819] 
BV [102] 
HD [819] 
HD [819] 
HD [819] 
HD [819] 
SP [26] 
ZV [1470] 
y2 [1237] 
DG [ 364) 
B 
GP [1291] 
GP [1291] 
xc [1237] 
B [618] 
ZV [1470] 
zv [1470] 
Y1 [637] 
B 
XB [1225] 
GP (1291] 
xQ [721] 
NL [1291] 
NL [1291] 
xo [721] 
y! [637] 
B 
XB [1225] 
XQ (721] 
X3  CH.18 
GP [1291] 
GP [1291] 
GP [1291] 
SP [1291] 
GP [1291] 
GP [1291] 
GP ( 1291] 
Y1 [637] 
B [1237] 


DISTANCE D 
N R 
34- 35 33 
36- 38 34- 36 
39 36.415 
40 37.415 
41 38 
42 39 
43 39.415 
44 40 
45- 47 40.415 
48 41.356 
49- 50 42 
51- 52 43- 44 
53- 55 45 
56- 63 46 
64 47 
65- 66 48 
67- 72 49- 54 
73- 74 55 
75 56 
76- 87 57 
88- 91 58- 61 
92- 95 62 
96- 98 63- 65 
99-101 65.678 
102-104 66 
105-111 67 
112-113 68- 69 
114-127 70 
128-135 71 
136-142 72 
143-145 73- 75 
146-149 76 
150-153 77 
154-155 78- 79 
156-161 80 
162-167 81 
168-173 82 
174-207 83 
208-255 84 
256-264 85 
265-272 86 
273-276 87 
277-279 88- 90 
280-287 91 
288-289 92 
290-296 93 
297-304 94 
305-312 95 
313-320 96 
321-328 97 
329-347 98 
348-512 99 


tt 
N 
w 




[1470] 
( 364] 


( 1225] 
[1470] 
[715] 

[ 1470] 


[1470] 
( 1470] 
( 1470] 


( 1225] 
( 721] 


( 1291] 
( 1291] 


[722] 
( 722] 
[722] 
(637] 


( 1225] 
( 721] 
( 1291] 


{ 1291] 
[1291] 
[1291] 
( 1291] 
[1291] 
( 1291] 
( 1291] 
( 637] 

[536] 




DISTANCE D 


N 
37- 38 
39- 42 

43 

uu 

45 

46 

47 

48 

49~ 51 
52 

53 

54 

55 

56- 61 
62- 63 
64- 66 
67 

68- 71 
72- 73 
74- 75 
76- 78 
79- 81 
82 

83- 86 
87- 91 
92- 93 
94 

95- 99 
100-101 
102 
103-109 
110-125 
126-127 
128-135 
136-142 
143-149 
150-152 
153-156 
157-161 
162-167 
168-173 
174-179 
180-185 
186-191 
192-207 
208-255 
256-264 
265-272 
273-280 
281-284 


R 


56 
60 


68 


77 


83 
= 25 
TY PE 
LI 
HD 
BV 
HD 
HD 
HD 
HD 
CM 
HD 


LI 
DG 


LI 


DG 


x 


PT 


B 


LI 


REF 
[ 308] 
[819] 


[102] 
[819] 
[819] 
(819) 
(819] 
( 1238] 
(819] 


[25] 
[758] 


[715] 
CH. 18 
[26] 
[1047] 
[637] 
CH. 10 


( 1291] 
[722] 


( 637] 


( 1225] 
(721) 
( 1291] 


( 1291] 
[722] 
[322] 
(722) 
(722) 
( 722] 
[722] 
[637] 


( 1225] 
(721) 

( 1291] 
( 1291] 
285-287 97- 99 
288-295 100 
296-297 101 
298-304 102 
305-312 103 
313-320 104 
321-328 105 
329-336 106 
337-347 107 
348-512 108 

DISTANCE D 
N R 
40- 41 39 

42- 45 40- 43 

46 43.415 

47 44.415 
48 45 
49 46 

50 46.678 

51 47.193 

52 47.830 

53- 55 48.193 

56 49.193 

57 50.093 
58- 63 51 

64- 67 52- 55 
68- 73 56 

74- 77 57- 60 
78- 79 61 

80- 81 62- 63 
82- 84 64 
85- 88 65 

89- 93 66- 70 
94- 95 71 

96- 97 72- 73 
98- 99 74 
100-103 75 
104-111 76 
112-127 77 

128-129 78- 79 
130-131 80 
132-135 81 
136-139 82 

140-141 83- 84 
142-143 85 
144-147 86 
148-151 87 
152-156 88 

157-158 89- 90 
159-167 91 


[ 1291] 
( 1291] 
( 1291] 
( 1291] 
[ 1291] 
( 1291] 
( 1291] 
[ 637] 

( 1237] 


REF 
[ 308] 
{ 819] 
[266] 
[819] 
(819] 
(819] 
(819] 


[819] 
[758] 


CH. 18 
( 1291] 


( 1083A] 
(637] 


CH. 10 


[1470] 
Í 637) 
[ 637] 


( 1291] 
( 1291] 
( 1291] 
( 1291] 


(722] 
168-173 
174-179 
180-185 
186-191 
192-197 
198-199 
200-207 
208-255 
256-264 
265-272 
273-280 
281-288 
289-292 
293-295 
296- 303 
304-305 
306-312 
313-320 
321-328 
329-336 
337-344 
345-352 
353-512 


57- 59 


65- 69 
70- 74 
75- 77 




92 KS 
93 KS 
94 KS 
95 KS 
96 KS 
97- 98 
99 Y1 
100 B 
101 XB 
102 XQ 
103 GP 
104 GP 
105 GP 
106- 108 
109 GP 
110 GP 
111 GP 
112 GP 
113 S 
114 GP 
115 GP 
116 GP 
117 GP 
29 
R IYPE 
42 LI 
43- 46 
46.415 HD 
47.415 
48.415 
49 BV 
49.678 HD 
50.415 HD 
51.193 HD 
51.678 HD 
52.093 HD 
53.046 CM 
54 RM 
55 
55.913 HD 
56.913 
57 X 
58- 62 
62.752 ZV 


( 722] 
( 722] 
(722] 
i722] 
[722] 


[637) 


( 1225] 
(721) 

( 1291] 
( 1291] 
( 1291] 


[1291] 
( 1291] 
( 1291] 
( 1291] 
( 1291] 
( 1291] 
( 1291] 
( 1291] 
( 536] 


( 819] 
CH. 18 


( 1470] 


106-111 

112 
113-125 
126-127 
128-135 
136-138 
139- 142 
143-146 
147-148 
149-150 
151-154 
155- 158 
159-163 
164-167 
168-179 
180-185 
186-191 
192-197 
198-202 
203-207 
208-255 
256-264 
265-272 
273-280 
281-288 
289-296 
297-300 
301-303 
304-311 
312-313 
314-320 
321-328 
329-336 
337-344 
305-352 
353-360 
361-512 


66- 67 


73- 76 


101 
102- 106 
107 
108 
109 
110 
111 
112 
113 
114 
115-117 
118 
119 
120 
121 
122 
123 
124 
125 
126 
[1291] 
( 1291] 


( 1291] 
[ 1470] 
[1470] 


CH. 10 
[1470] 


[ 1470] 
[744] 
( 1225] 


( 1291] 
(721] 


( 1291] 
( 1291] 
( 1291] 
( 1291] 
(722] 
[722] 
( 722] 
( 722] 
[722] 


(631] 


( 1225] 
(721] 

( 1291] 
( 1291] 
( 1291] 
( 1291] 


( 1291] 
( 1291] 
( 1291] 
( 1291] 
( 1291] 
( 1291] 
( 1291] 
( 1291] 
( 1237] 


THE NUMBER OF CODEWORDS IN FIG. 2 


IF THE REDUNDANCY R = I + X WHERE I IS AN INTEGER AND 
X IS BETWEEN 0 AND 1, THE NUMBER OF CODEWORDS IS 

F * 2**(N-I-5) 
WHERE F IS GIVEN BY 


X: 0    .046 .093 .142 .193 .245 .300 .356 
F: 32   31   30   29   28   27   26   25 
X: .415 .476 .541 .608 .678 .752 .830 .913 
F: 24   23   22   21   20   19   18   17 
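
The F/X table follows from the definition of the redundancy: since M = F * 2**(N-I-5) and R = N - log2(M) = I + X, each multiplier F corresponds to X = 5 - log2(F). The short Python sketch below regenerates the sixteen pairs; the function name is illustrative and not part of the original table.

```python
import math

def fractional_part_for(f):
    """Fractional part X of the redundancy for a multiplier F, 17 <= F <= 32."""
    return round(5 - math.log2(f), 3)

# Rebuild the F -> X table, largest F first (the order used in the printed table).
table = {f: fractional_part_for(f) for f in range(32, 16, -1)}

for f, x in table.items():
    print(f"F = {f:2d}   X = {x:.3f}")
```

Comparing the printed output with the table is a quick way to catch misprinted values of X.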


TYPES OF CODES IN FIG. 2 


AL = ALTERNANT CODE (LINEAR) (CH. 12, [633]) 

AX = THE !A+X!B+X!A+B+X! CONSTRUCTION (LINEAR) (SECT. 7.4 
     OF CH. 18, [1237]) 

B  = BCH OR SHORTENED BCH CODE (LINEAR) (CH. 3, 7, 9) 

BV = BELOV ET AL.'S LINEAR CODES WHICH MEET THE GRIESMER 
     BOUND (SECT. 6 OF CH. 17, [102]) 

CM = CONFERENCE MATRIX CODE (NONLINEAR) (SECT. 4 OF CH. 2, 
     [1238]) 

CY = CYCLIC OR SHORTENED CYCLIC LINEAR CODE. 

DC = DOUBLE CIRCULANT CODE (LINEAR) (SECT. 7 OF CH. 16, 
     [715]) 

DG = DELSARTE-GOETHALS GENERALIZED KERDOCK CODE (NONLINEAR) 
     (SECT. 5 OF CH. 15, [364]) 

GB = GENERALIZED BCH CODE (LINEAR) (SECT. 7 OF CH. 12, 
     [286]) 

GO = GOLAY CODE (LINEAR) (SECT. 6 OF CH. 2, [506]) 

GP = GOPPA OR MODIFIED GOPPA CODE (LINEAR) (SECT. 3 OF CH. 
     12, [536], [1291]) 

HD = HADAMARD MATRIX CODE (NONLINEAR) (SECT. 3 OF CH. 2, 
     [819]) 

HG = HAMMING CODE (LINEAR) (SECT. 7 OF CH. 1, [592]) 

HS = HELGERT AND STINAFF'S CONSTRUCTION A (SECT. 9.2 OF CH. 
     18, [636]) 

IM = GOETHALS NONLINEAR CODE I(M) (SECT. 7 OF CH. 15, 
     [495]) 

KS = KASAHARA ET AL.'S MODIFIED CONCATENATED CODES (LINEAR 
     OR NONLINEAR) (SECT. 8.1 OF CH. 18, [722]) 

LI = LINEAR CODE. 

NL = NONLINEAR CODE. 

PR = PREPARATA CODE (NONLINEAR) (SECT. 6 OF CH. 15, [1081]) 

PT = PIRET'S CONSTRUCTION (LINEAR) (SECT. 7.5 OF CH. 18, 
     [1047]) 

QR = QUADRATIC RESIDUE CODE (LINEAR) (CH. 16) 

RM = REED-MULLER OR SHORTENED REED-MULLER CODE (LINEAR) 
     (CH. 13) 

SV = SRIVASTAVA CODE (LINEAR) (SECT. 6 OF CH. 12) 

SW = NONLINEAR SINGLE-ERROR-CORRECTING CODE (SECT. 9 OF CH. 
     2, [1239]) 

X  = CONSTRUCTION X (LINEAR OR NONLINEAR) (SECT. 7.1 OF CH. 
     18, [1237]) 

XB = APPLY CONSTRUCTION X TO A BCH CODE (LINEAR) (SECT. 7.1 
     OF CH. 18, [1237]) 

XC = APPLY CONSTRUCTION X TO A CYCLIC CODE (LINEAR) (SECT. 
     7.1 OF CH. 18, [1237]) 

XP = APPLY CONSTRUCTION X TO A PREPARATA CODE (NONLINEAR) 
     (SECT. 7.1 OF CH. 18, [1237]) 

XQ = KASAHARA ET AL.'S EXTENDED BCH CODES (LINEAR) (PROBLEM 
     14 OF CH. 18, [721]) 

X3 = CONSTRUCTION X3 (LINEAR OR NONLINEAR) (PROBLEM 14 OF 
     CH. 18) 

X4 = CONSTRUCTION X4 (LINEAR OR NONLINEAR) (SECT. 7.2 OF 
     CH. 18, [1237]) 

Y1 = CONSTRUCTION Y1 (LINEAR) (SECT. 9.1 OF CH. 18) 

Y2 = CONSTRUCTION Y2 (NONLINEAR) (PROBLEM 29 OF CH. 18) 

Y3 = CONSTRUCTION Y3 (NONLINEAR) (PROBLEM 29 OF CH. 18) 

Y4 = CONSTRUCTION Y4 (LINEAR) (PROBLEM 30 OF CH. 18, [637]) 

Z  = THE !U!U+V! CONSTRUCTION (LINEAR OR NONLINEAR) (SECT. 
     9 OF CH. 2, [1239]) 

ZV = ZINOVIEV'S CONSTRUCTION (LINEAR OR NONLINEAR) (SECT. 
     8.2 OF CH. 18, [1470]) 


§2. Figure 1, a small table of A(n, d) 


Figure 1 gives upper and lower bounds on A(n, d). The Plotkin-Levenshtein 
theorem (Theorem 11 of Ch. 17) gives all codes in this figure on and 
above the line n = 2d + 1. Below the line n = 2d + 1, unless indicated otherwise, 
the upper bounds are obtained by linear programming (§4 of Ch. 17) or Theorem 
10b of Ch. 17. The lower bounds are continued in Fig. 2. 


Research Problem (A2). Of course all the undecided entries in these figures are 
research problems. One particularly interesting code which might exist is a 
(20, 4096, 5) code, corresponding to A(20, 5) ≥ 4096. The weight distribution of 
the (21, 4096, 6) extended code would be, from linear programming (§4 of Ch. 
17): $B_0 = 1$, $B_6 = 314$, $B_8 = 595$, $B_{10} = 1596$, $B_{12} = 1015$, $B_{14} = 490$, $B_{16} = 84$, 
$B_{20} = 1$. 
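
Two quick consistency checks can be run on the conjectured weight distribution in Research Problem (A2). The sketch below assumes the weights are 0, 6, 8, 10, 12, 14, 16 and 20 (the subscripts are partly illegible in the printed text, so this reading is an assumption). The counts should add up to 4096 codewords, and the average weight comes out to n/2 = 10.5, which is what one would expect if the first coefficient of the transformed distribution vanishes.

```python
# Conjectured weight distribution of a (21, 4096, 6) code.
# The weights (dictionary keys) are an assumed reading of the subscripts.
B = {0: 1, 6: 314, 8: 595, 10: 1596, 12: 1015, 14: 490, 16: 84, 20: 1}

total = sum(B.values())                                   # number of codewords
mean_weight = sum(w * b for w, b in B.items()) / total    # average weight

print(total, mean_weight)
```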





Distance 4: A(n, 4, w) 
2 3 4 5 6 7 8 9 10 11 12 


ON tA tA 4 4 O9 Q9 FN F2 
Fig. 3. A table of A(n, d, w). For the latest versions of the tables of A(n, d, w) see "Lower bounds for constant weight codes" by R. L. Graham and N. J. A. Sloane, 
IEEE Trans. Info. Theory, Vol. IT-26 (January 1980) 37-43. 
Distance 6: A(n, 6, w) 
nf w 2 3 6 7 8 9 
6 1 2 1 
7 1 2 1 1 
8 1 2 1 1 
9 1 3 3 1 1 
10 1 3 5 3 1 
11 1 3 11 6 1 
12 1 4 22 12 4 
13 1 4 *26 26 13 
14 1 4 "42 “42-51 28 
15 1 5 "710 ‘60-88" 70 
"112 ‘9Q- '120- 90- 
-156 -150* -156 








232- 
-276 





'172- ‘228 i332- 
-228 -520" -739 


310- i492- 
-651 -1199" 
Distance 6: A(n, 6, w) 
Distance 8: A(n, 8, w) 



































18 l l 4 9 i20- 133- 146— 46— 
-21 -4Al* -63 -63 
19 l l 4 12 28 i52- i78- 88- 
-57 -97 -122 
20 l l 5 16 40 80 i130- i176- 
-142 -244* 
21 5 21 156 120 210 i280- i336- 
-331 -399*- 
22 1 1 5 21° "77 1176 1330 '616- 
-728 
23 1 1 5 *23 77- i253 i506 616- 
-80 -1111+ 
24 l l 6 *24 77- 253- 9159 3960— 
-92 -274 -1639" 
Distance 10: A(n, 10, w) 
Key to Fig. 3 


a Theorems 6,7 of Ch. 17. 

b Shen Lin [831]. 

c H. R. Phinney [1043]. 

d A. E. Brouwer [201a]. 

e From Problem 6 of Ch. 17 and the Steiner systems S(5,6, 12), S(3, 5,17), S(3,6, 26), 
S(5, 6, 24), S(5, 7,28) (Chen [277], Denniston [372], Doyen and Rosa [386]). 

f From the nonexistence of Steiner systems S(4, 5, 15), S(4, 6, 18) ([386]). 

g A cyclic code. 

h From the t-design 3-(16, 6, 4) (from the Nordstrom-Robinson code). 

i From translates of the Nordstrom-Robinson code (see Fig. 6.2 and [201a]). 

j From the [24, 12, 8] Golay code. 

k See Problem 1. 

l Linear programming bound (§4 of Ch. 17). 

m From a conference matrix. 

n A quasi-cyclic code. 

q Problems 7 and 8 of Ch. 17. 

r W.G. Valiant, personal communication. 

s D. Stinson, personal communication. 

t See [140]. 


§3. Figure 2, an extended table of the best codes known 


Let M be the size of the largest known (linear or nonlinear) binary code of 
length N ≤ 512 and minimum distance D ≤ 29. Figure 2 tabulates the redundancy 

$R = N - \log_2 M$ 

of this code. 

Since the code may be nonlinear, R need not be an integer. A small table 
below Figure 2 makes it easy to find M given R. If R = I + X, where I is an 
integer and 0 ≤ X < 1, the number of codewords is 

$M = F \cdot 2^{N-I-5}$, 

where F is given in the table. For example, consider the entry 

N = 8, R = 3.678, Type = SW, Ref. [509] 

near the beginning of the figure. Here I = 3, X = 0.678, F = 20 and the number of 
codewords is 

$M = 20 \cdot 2^{0} = 20$. 

This is an (8, 20, 3) code, of type SW (see the list of types below the figure), 
found by Golay [509] (see §7 of Chapter 2). 

To save space, several entries have been compressed into one. For example, 
under Distance D = 11, the entry 

48-50 24-26 

is a contraction of 

48 24 
49 25 
50 26. 

Shortened codes have been given the same name as the parent code. For 
example CY denotes a cyclic code or a shortened cyclic code. 
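
The conversion from a table entry (N, R) to the number of codewords M is easy to mechanize. The following Python sketch shows two equivalent routes, one directly from M = 2^(N - R) and one through the multiplier F; both function names are illustrative. Because R is printed to three decimal places, each route rounds to the nearest integer.

```python
import math

def codewords(n, r):
    """Number of codewords M = 2**(n - r), rounded to the nearest integer."""
    return round(2 ** (n - r))

def codewords_via_table(n, r):
    """Same value via M = F * 2**(n - I - 5), where R = I + X."""
    i = math.floor(r)
    x = r - i
    f = round(2 ** (5 - x))        # inverts X = 5 - log2(F)
    return f * 2 ** (n - i - 5)    # a float when n - i - 5 is negative

print(codewords(8, 3.678), codewords_via_table(8, 3.678))
```

For the (8, 20, 3) example in the text both routes give 20 codewords.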


§4. Figure 3, a table of A(n, d, w) 


This table is important because it leads to bounds on A(n, d) (§3 of Ch. 17), 
and in its own right for providing constant weight codes (§2 of Ch. 17) and as 
the solution to a packing problem. The packing problem is sometimes stated 
as follows. Find D(t, k, n), the maximum number of k-sets from an n-set S 
such that every t-set from S is contained in at most one k-set (see for 
example Gardner [467], Kalbfleisch et al. [708-711], Mills [957-959], Stanton 
[1265], Stanton et al. [1267-1271] and Swift [1295]). See also the football pool 
problem mentioned in Ch. 7. Since D(t, k, n) = A(n, 2k - 2t + 2, k), Fig. 3 is also 
a table of D(t, k, n). 
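
The identity D(t, k, n) = A(n, 2k - 2t + 2, k) says that a family of k-sets is a packing of t-sets exactly when the corresponding weight-k binary vectors are at distance at least 2k - 2t + 2. As a small illustration (not taken from the text), the Python sketch below checks both properties for the seven lines of the Fano plane, with t = 2, k = 3, n = 7; the helper names are hypothetical.

```python
from itertools import combinations

def is_packing(blocks, t):
    """True if every t-subset of the point set lies in at most one block."""
    seen = set()
    for block in blocks:
        for sub in combinations(sorted(block), t):
            if sub in seen:
                return False
            seen.add(sub)
    return True

def min_distance(blocks):
    """Minimum Hamming distance between the indicator vectors of the blocks."""
    return min(len(a ^ b) for a, b in combinations(blocks, 2))

# The seven lines of the Fano plane on the point set {0, ..., 6}.
fano = [frozenset(s) for s in
        [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
         (1, 4, 6), (2, 3, 6), (2, 4, 5)]]

t, k = 2, 3
print(is_packing(fano, t), min_distance(fano), 2 * k - 2 * t + 2)
```

The seven blocks form a constant weight code of length 7, weight 3 and distance 4, so D(2, 3, 7) = A(7, 4, 3) is at least 7.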

Unless indicated otherwise the upper bounds in Fig. 3 are from the 
Johnson bounds Theorems 1—4 and Problem 2 of Ch. 17, or from Problem 4 
of Ch. 17. Unmarked lower bounds are also from Theorem 4 and Problem 2 
of Ch. 17, or can be found by easy constructions. The small letters are 
explained in the key on p. 690. Letters on the left refer to lower bounds, on the 
right to upper bounds. There may be several ways to construct one of these 
codes, but only the simplest is mentioned. A slightly later version of this table 
will appear in Best et al. [140]. 


Problem. (1) 
Use the last row of Fig. 5 of Ch. 14 to show that A(16,8,6) = 16 and 
A(15,8, 6) = 10. 


Research Problem (A3). A 2-(17, 6, 15) design certainly exists (Hanani [600]). 
But is there a 2-(17,6,15) where the blocks form a code proving that 
A(17, 6, 6) = 136? 


Research Problem (A4). Some more constant weight codes which might exist, 
together with their distance distributions (found by linear programming): 
A(21, 4, 5) ≥ 1197 ($B_0 = 1$, $B_4 = 80$, $B_6 = 320$, $B_8 = 540$, $B_{10} = 256$), 
A(16, 6, 8) ≥ 150 ($B_0 = 1$, $B_6 = 64$, $B_8 = 20$, $B_{10} = 64$, $B_{16} = 1$). 


Research Problem (A5). Is it true that if $w_1 < w_2 \le \tfrac{1}{2}n$ then 
$A(n, d, w_1) \le A(n, d, w_2)$? 





Appendix B 


Finite geometries 


§1. Introduction 


Finite geometries are large combinatorial objects just as codes are, and 
therefore it is not surprising that they have turned up in many chapters of this 
book (see especially Ch. 13). In this Appendix we sketch the basic theory of 
these geometries, beginning in §2 with the definitions of projective and affine 
geometries. The most important examples (for us) are the projective geometry 
PG(m, q) and the affine or Euclidean geometry EG(m, q) of dimension m 
constructed from a finite field GF(q). For dimension m ≥ 3 there are no other 
geometries (Theorem 1). 

In §3 we study some of the properties of PG(m, q) and EG(m, q), 
especially their collineation groups (Theorem 7 and Corollary 9) and the 
number of subspaces of each dimension (Theorems 4-6). 

In dimension 2 things are more complicated. A projective plane is 
equivalent to a Steiner system S(2, n + 1, n² + n + 1), for some n ≥ 2, and an 
affine plane to an S(2, n, n²) for some n ≥ 2 (Theorem 10). But now other 
kinds of planes exist besides PG(2, q) and EG(2, q); see §4. 


§2. Finite geometries, PG(m, q) and EG(m, q) 


Definition. A finite projective geometry consists of a finite set Ω of points 
p, q, ... together with a collection of subsets L, M, ... of Ω called lines, which 
satisfies axioms (i)-(iv). (If p ∈ L we say that p lies on L or L passes through 
p.) 

(i) There is a unique line (denoted by (pq)) passing through any two 
distinct points p and q. 

(ii) Every line contains at least 3 points. 

(iii) If distinct lines L, M have a common point p, and if q, r are points of L 
not equal to p, and s, t are points of M not equal to p, then the lines (qt) and 
(rs) also have a common point (see Fig. 1). 





Fig. 1. Axiom (iii). 


(iv) For any point p there are at least two lines not containing p, and for 
any line L there are at least two points not on L. 

A subspace of the projective geometry is a subset S of Ω such that 

(v) If p, q are distinct points of S then S contains all points of the line (pq). 

Examples of subspaces are the points and lines of Ω and Ω itself. A 
hyperplane H is a maximal proper subspace, so that Ω is the only subspace 
which properly contains H. 


Definition. An affine or Euclidean geometry is obtained by deleting the points 
of a fixed hyperplane H (called the hyperplane at infinity) from the subspaces 
of a projective geometry. The resulting sets are called the subspaces of the 
affine geometry. 

A set T of points in a projective or affine geometry is called independent if, 
for every x ∈ T, x does not belong to the smallest subspace which contains 
T - {x}. For example, any three points not on a line are independent. The 
dimension of a subspace S is r—1, where r is the size of the largest set of 
independent points in S. In particular, if S = Q this defines the dimension of 
the projective geometry. 


The projective geometry PG(m, q). The most important examples of projective 
and affine geometries are those obtained from finite fields. 

Let GF(q) be a finite field (see Chs. 3, 4) and suppose m ≥ 2. The points of 
Ω are taken to be the nonzero (m + 1)-tuples 


$(a_0, a_1, \ldots, a_m)$, $a_i \in GF(q)$, 


with the rule that 

$(a_0, \ldots, a_m)$ and $(\lambda a_0, \ldots, \lambda a_m)$ 


are the same point, where λ is any nonzero element of GF(q). These are 
called homogeneous coordinates for the points. There are $q^{m+1} - 1$ nonzero 
(m + 1)-tuples, and each point appears q - 1 times, so the number of points in 
Ω is $(q^{m+1} - 1)/(q - 1)$. 

The line through two distinct points $(a_0, \ldots, a_m)$ and $(b_0, \ldots, b_m)$ consists 
of the points 


$(\lambda a_0 + \mu b_0, \ldots, \lambda a_m + \mu b_m)$, (1) 


where λ, μ ∈ GF(q) are not both zero. A line contains q + 1 points since there 
are q² - 1 choices for λ, μ and each point appears q - 1 times in (1). 
Axioms (i), (ii) are clearly satisfied. 
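
The construction can be tried out directly when q is a prime, so that arithmetic in GF(q) is ordinary arithmetic modulo q. The Python sketch below is an illustration under that assumption (prime powers would need a genuine finite field implementation). It lists the points of PG(m, q) in normalised form, builds every line from formula (1), and lets one confirm the counts quoted above. All function names are hypothetical.

```python
from itertools import product

def pg_points(m, q):
    """Points of PG(m, q), q prime, scaled so the first nonzero coordinate is 1."""
    pts = []
    for v in product(range(q), repeat=m + 1):
        nonzero = [a for a in v if a]
        if nonzero and nonzero[0] == 1:
            pts.append(v)
    return pts

def normalise(v, q):
    """Scale a nonzero vector so that its first nonzero coordinate is 1."""
    lead = next(a for a in v if a)
    inv = pow(lead, -1, q)  # modular inverse (Python 3.8 or later)
    return tuple(a * inv % q for a in v)

def line(p1, p2, q):
    """All points lam*p1 + mu*p2 with (lam, mu) != (0, 0), as in (1)."""
    pts = set()
    for lam, mu in product(range(q), repeat=2):
        if lam or mu:
            v = tuple((lam * a + mu * b) % q for a, b in zip(p1, p2))
            pts.add(normalise(v, q))
    return frozenset(pts)

def pg_lines(m, q):
    """The set of all lines of PG(m, q), each line a frozenset of points."""
    pts = pg_points(m, q)
    return {line(a, b, q) for i, a in enumerate(pts) for b in pts[i + 1:]}

print(len(pg_points(2, 2)), len(pg_lines(2, 2)))
print(len(pg_points(2, 3)), len(pg_lines(2, 3)))
```

Each line returned by `pg_lines` has q + 1 points, in agreement with the count derived from (1).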


Problem. (1) Check that (iii) and (iv) hold. 
The projective geometry defined in this way is denoted by PG(m, q). 
Problem. (2) Show that PG(m, q) has dimension m. 


Examples. (1) If m = q = 2, the projective plane PG(2, 2) contains 7 points 
labeled (001), (010), (100), (011), (101), (110), (111), and 7 lines, as shown in 
Fig. 2 (cf. Fig. 2.12). 


Fig. 2. The projective plane PG(2, 2). 


(2) If m = 2, q = 3 we obtain the projective plane PG(2, 3), containing 
3² + 3 + 1 = 13 points 

(001) (010) (011) (012) 
(100) (101) (102) (110) 
(111) (112) (120) (121) 
(122), 


and 13 lines, nine of which are shown in Fig. 3. 


Fig. 3. The 13 points and nine of the 13 lines of the projective plane PG(2, 3). 


It is convenient to extend the definition of PG(m, q) to include the values 
m = -1, 0 and 1, even though these degenerate geometries do not satisfy 
axiom (iv). Thus PG(-1, q) is the empty set, PG(0, q) is a point, and PG(1, q) 
is a line. 

A hyperplane or subspace of dimension m - 1 in PG(m, q) consists of 
those points $(a_0, \ldots, a_m)$ which satisfy a homogeneous linear equation 


$\lambda_0 a_0 + \lambda_1 a_1 + \cdots + \lambda_m a_m = 0$, $\lambda_i \in GF(q)$. 


Such a hyperplane is in fact a PG(m - 1, q), and will be denoted by 
$[\lambda_0, \ldots, \lambda_m]$. Note that $[\lambda_0, \ldots, \lambda_m]$ and $[\mu\lambda_0, \ldots, \mu\lambda_m]$, μ ≠ 0, represent the 
same hyperplane. The lines (i.e. hyperplanes) in Figs. 2, 3 have been labeled in 
this way. Clearly a point $(a_0, \ldots, a_m)$ is on the hyperplane $[\lambda_0, \ldots, \lambda_m]$ iff 
$\sum_{i=0}^{m} \lambda_i a_i = 0$. 
Problems. (3) Show that the points of PG(m, q) can be uniquely labeled by 
making the left-most nonzero coordinate equal to 1 (as in Example 2). 

(4) Show that PG(m, q) is constructed from a vector space V of dimension 
m + 1 over GF(q) by taking the 1-dimensional subspaces of V to be the points 
of PG(m, q) and the 2-dimensional subspaces to be the lines. 

(5) Find the four missing lines in Fig. 3. 

(6) Construct PG(3, 2). 


The affine or Euclidean geometry EG(m, q). This is obtained from 
PG(m, q) by deleting the points of a hyperplane H (it doesn't matter which one, 
by Corollary 8). For example, deleting the line [100] from Fig. 2 gives the EG(2, 2) 


In general if we choose H to be the hyperplane [1 0 ⋯ 0] consisting of all 
points with $a_0 = 0$, we are left with the points whose coordinates can be taken 
to be $(1, a_1, \ldots, a_m)$. In this way the $q^m$ points of EG(m, q) can be labeled by 
the m-tuples $(a_1, \ldots, a_m)$, $a_i \in GF(q)$. 
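
The passage from PG(m, q) to EG(m, q) can be sketched in a few lines of Python, again assuming q is prime so that coordinates are integers modulo q. The points off the hyperplane [1 0 ... 0] are those whose first coordinate is nonzero, and after normalisation they are labeled by their remaining m coordinates. The function names are illustrative only.

```python
from itertools import product

def pg_points(m, q):
    """Points of PG(m, q), q prime, scaled so the first nonzero coordinate is 1."""
    pts = []
    for v in product(range(q), repeat=m + 1):
        nonzero = [a for a in v if a]
        if nonzero and nonzero[0] == 1:
            pts.append(v)
    return pts

def eg_points(m, q):
    """Points of PG(m, q) off the hyperplane a_0 = 0, labeled by (a_1, ..., a_m)."""
    return [p[1:] for p in pg_points(m, q) if p[0] != 0]

print(eg_points(2, 2))       # the four points of EG(2, 2)
print(len(eg_points(2, 3)))  # q**m points in general
```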

Again we make the convention that EG(-1, q) is empty, EG(0, q) is the 
point 0, and EG(1, q) is a line. 


Problem. (7) Show that the dimension (as defined above) of EG(m, q) is equal 
to m. Show that EG(m, q) is also a vector space of dimension m over GF(q). 


Remark. The nonzero elements of $GF(q^{m+1})$ represent the points of PG(m, q), 
but there are q - 1 elements sitting on each point. For example, take GF(4) = 
{0, 1, ω, ω²}. The elements 100, ω00, ω²00 of GF(4³) all represent the point 
(100) of PG(2, 4). The line through (100) and (010) contains the five points 


100 (or ω00 or ω²00), 
010 (or 0ω0 or 0ω²0), 
110 (or ωω0 or ω²ω²0), 
1ω0 (or ωω²0 or ω²10), 
1ω²0 (or ω10 or ω²ω0). 


The points of the affine geometry are all elements of $GF(q^m)$. 
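
The GF(4) example can be reproduced by machine. In the Python sketch below the elements 0, 1, ω, ω² are encoded as the integers 0, 1, 2, 3 (an encoding chosen for this illustration, not taken from the text), so that addition is bitwise XOR and multiplication follows from ω³ = 1. The line through (100) and (010) then consists of 15 nonzero vectors grouped into 5 points of 3 vectors each.

```python
# GF(4) = {0, 1, w, w^2} encoded as 0, 1, 2, 3; addition is XOR.
LOG = {1: 0, 2: 1, 3: 2}   # discrete logarithms to base w
EXP = {0: 1, 1: 2, 2: 3}   # inverse table: w**i

def mul(a, b):
    """Product of two GF(4) elements in the integer encoding."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 3]

def scalar_multiples(v):
    """The q - 1 = 3 nonzero vectors that represent the same point as v."""
    return frozenset(tuple(mul(c, a) for a in v) for c in (1, 2, 3))

def line(p, r):
    """Points lam*p + mu*r with (lam, mu) != (0, 0); each point is a set of 3 vectors."""
    pts = set()
    for lam in range(4):
        for mu in range(4):
            if lam or mu:
                v = tuple(mul(lam, a) ^ mul(mu, b) for a, b in zip(p, r))
                pts.add(scalar_multiples(v))
    return pts

points = line((1, 0, 0), (0, 1, 0))
print(len(points))
```

In this encoding the point written 1ω0 in the text is the set {(1, 2, 0), (2, 3, 0), (3, 1, 0)}.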


Desarguesian geometries. If the dimension exceeds 2 all projective and affine 
geometries come from finite fields. But in dimension 2 things can be more 
complicated. 
Theorem 1. If m ≥ 3 then a finite projective geometry of dimension m is a 
PG(m, q) for some q, and an affine geometry of dimension m is an EG(m, q) 
for some q. 


PG(m, q) is called a Desarguesian geometry since Desargues’ theorem holds 
there. 

The proof of Theorem 1 is in three steps. (i) A projective geometry of 
dimension m > 2 is one in which Desargues’ theorem holds; (ii) the points of a 
Desarguesian geometry can be given coordinates from a possibly noncommutative 
field S; (iii) if S is finite it is commutative, hence S = GF(q) for 
some q. For the details see Artin [29, Ch. 2], Baer [56, Ch. 7], Dembowski 
[370, Ch. 1], Herstein [642, p. 70], Veblen and Young [1368, Vol. 1, Ch. 2]. 


§3. Properties of PG(m, q) and EG(m, q) 
Subspaces of PG(m,q) 


Problem. (8) Show that if S is a subspace of PG(m, q) then S is a PG(r, q) for 
some r, 0 ≤ r ≤ m, and that S may be defined as the set of points satisfying 
m - r independent homogeneous linear equations. 

This implies that the intersection of two distinct PG(m - 1, q)'s in a 
PG(m, q) is a PG(m - 2, q) (since the points satisfy two linear equations). The 
intersection of a PG(m - 1, q) and a PG(m - 2, q) is either the PG(m - 2, q) 
or a PG(m - 3, q), and so on. The intersection of a PG(m - 1, q) and a line is 
either the line or a point. 

In general, the intersection of a PG(r, q) (defined by m - r equations) and a 
PG(s, q) (m - s equations) has dimension r, r - 1, ..., or r - m + s, supposing 
s ≥ r. If r - m + s < 0, the subspaces may be disjoint. 


Principle of duality. Since points and hyperplanes in a PG(m, q) are both 
represented by (m + 1)-tuples, there is a natural 1-1 correspondence between 
them, with the point p corresponding to the dual hyperplane [p]. Similarly 
there is a 1-1 correspondence between lines and subspaces PG(m - 2, q), 
with the line (pq) corresponding to the dual subspace [p] ∩ [q]. This 
correspondence has the property that if p is on the line (qr) then the dual 
hyperplane [p] contains the dual subspace [q] ∩ [r]. 

Similarly there is a 1-1 correspondence (the technical term is a 
correlation) between subspaces of dimension r and subspaces of dimension 
m - r - 1, which preserves incidence. For example if two PG(r, q)'s meet in a 
point then the dual PG(m - r - 1, q)'s span a hyperplane. 

This correspondence justifies the principle of duality, which says that any 
statement about PG(m, q) remains true if we interchange "point" and 
"hyperplane," "PG(r, q)" and "PG(m - r - 1, q)," "intersect" and "span," 
and "contained in" and "contains." 

An important application of this principle is: 


Theorem 2. If s > r, the number of subspaces PG(s, q) in PG(m, q) which 
contain a given PG(r, q) is equal to the number of PG(m - s - 1, q) contained 
in a given PG(m - r - 1, q). 


Problem. (9) Prove directly the special case of Theorem 2 which says that the 
number of lines through a point is equal to the number of points on a 
hyperplane. 


The number of subspaces 


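Theorem 3. The number of PG(r, q) contained in a PG(m, q) is 

    \frac{(q^{m+1}-1)(q^{m+1}-q)\cdots(q^{m+1}-q^{r})}{(q^{r+1}-1)(q^{r+1}-q)\cdots(q^{r+1}-q^{r})} = \begin{bmatrix} m+1 \\ r+1 \end{bmatrix},    (2)

where \begin{bmatrix} m+1 \\ r+1 \end{bmatrix} is a Gaussian binomial 
coefficient defined in Problem 3 of Ch. 15. 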


Proof. The numerator of (2) is the number of ways of picking r+ 1 in- 
dependent points in PG(m, q), to define a PG(r, q). However, many of these 
sets of points determine the same PG(r,q), so we must divide by the 
denominator of (2), which is the number of ways of picking r+ 1 independent 
points in a PG(r, q). Q.E.D. 
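
The counting argument in this proof translates directly into a few lines of code. The following Python sketch is only an illustration: the function names `gaussian_binomial` and `count_projective_subspaces` are our own and do not come from the text. It forms the numerator and denominator of (2) factor by factor and divides.

```python
def gaussian_binomial(n, k, q):
    """Gaussian binomial coefficient: the number of k-dimensional
    subspaces of an n-dimensional vector space over GF(q)."""
    if k < 0 or k > n:
        return 0
    num, den = 1, 1
    for i in range(k):
        num *= q**n - q**i  # choices for the next independent vector
        den *= q**k - q**i  # the same count inside one fixed subspace
    return num // den


def count_projective_subspaces(r, m, q):
    """Number of PG(r, q) contained in PG(m, q), following Theorem 3."""
    return gaussian_binomial(m + 1, r + 1, q)
```

For example, `count_projective_subspaces(0, 2, 2)` and `count_projective_subspaces(1, 2, 2)` both give 7, the numbers of points and lines of PG(2, 2), and `count_projective_subspaces(1, 3, 2)` gives 35 lines in PG(3, 2).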


A similar argument proves: 


Theorem 4. In PG(m, q) let R = PG(r, q) ⊂ S = PG(s, q). The number of 
subspaces T of dimension t with R ⊂ T ⊂ S is 

    \begin{bmatrix} s-r \\ t-r \end{bmatrix}.


Problem. (10) Use Theorem 3 to show that the number of PG(r, q) contained 
in PG(m, q) is equal to the number of PG(m - r - 1, q) contained in PG(m, q). 


Subspaces and flats of EG(m, q). A subspace S of EG(m, q) is called a flat. 


Problem. (11) Show that if a flat contains the origin then it is a linear subspace 
of EG(m, q) regarded as a vector space; and that a flat not containing the 
origin is a coset of a linear subspace. 

Thus a flat of dimension r in EG(m, q) is a coset of an EG(r, q), and will be 
referred to as an EG(r, q) or an r-flat. A subspace PG(r, q) of PG(m, q) is 
also called an r-flat. 


Theorem 5. The number of EG(r, q) in an EG(m, q) is 

    q^{m-r} \begin{bmatrix} m \\ r \end{bmatrix}.


Proof. Let EG(m, q) be obtained from PG(m, q) by deleting the hyperplane H. 
A PG(r, q) either meets H in a PG(r - 1, q) or is contained in H. Thus the 
desired number is the difference between the number of PG(r, q) in PG(m, q) 
and the number of PG(r, q) in H. By Theorem 3 this is 

    \begin{bmatrix} m+1 \\ r+1 \end{bmatrix} - \begin{bmatrix} m \\ r+1 \end{bmatrix} = q^{m-r} \begin{bmatrix} m \\ r \end{bmatrix},

by Problems 3(b), 3(e) of Ch. 15. Q.E.D. 
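
The two ways of counting flats used in this proof can be compared numerically. The sketch below is illustrative only, and all function names in it are our own. It evaluates the closed form of Theorem 5 and the "delete a hyperplane" difference from the proof, so that the two can be checked against each other for small parameters.

```python
def gaussian_binomial(n, k, q):
    """Number of k-dimensional subspaces of GF(q)^n."""
    if k < 0 or k > n:
        return 0
    num, den = 1, 1
    for i in range(k):
        num *= q**n - q**i
        den *= q**k - q**i
    return num // den


def count_affine_flats(r, m, q):
    """Closed form of Theorem 5: number of EG(r, q) in EG(m, q)."""
    return q**(m - r) * gaussian_binomial(m, r, q)


def count_flats_by_deletion(r, m, q):
    """Count used in the proof: r-flats of PG(m, q) minus those
    lying entirely in the deleted hyperplane H = PG(m - 1, q)."""
    return gaussian_binomial(m + 1, r + 1, q) - gaussian_binomial(m, r + 1, q)
```

For instance, `count_affine_flats(1, 2, 3)` returns 12, the number of lines in the affine plane EG(2, 3), and `count_flats_by_deletion(1, 2, 3)` agrees (13 lines of PG(2, 3) minus the one deleted line).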


Theorem 6. In EG(m, q) let R = EG(r, q) ⊂ S = EG(s, q), where r ≥ 1. The 
number of flats T of dimension t with R ⊂ T ⊂ S is 

    \begin{bmatrix} s-r \\ t-r \end{bmatrix}.

Proof. Follows from Theorem 4. Q.E.D. 


Note that in a projective geometry two hyperplanes always meet in a 
subspace of dimension m - 2, whereas in an affine geometry two hyperplanes 
may meet in a subspace of dimension m - 2 or not at all. Disjoint hyperplanes 
are called parallel. 


Problem. (12) Show that EG(m, q) can be decomposed into q mutually parallel 
hyperplanes. 


The collineation group of PG(m, q). 

Definition. A collineation of a projective or affine geometry is a permutation 
of its points which maps lines onto lines. It follows that every subspace is 
mapped onto a subspace of the same dimension. 

For example, the permutation ((011), (010), (001)) ((110), (111), (100)) is a 
collineation of PG(2, 2) - see Fig. 2. 
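
This example can be checked mechanically. The sketch below assumes (Fig. 2 is not reproduced here, so this is an assumption) that the points of PG(2, 2) are labelled by the seven nonzero binary triples and that each line consists of three points u, v, u + v. The helper names are our own. It builds the permutation from its two cycles, leaves the remaining point (101) fixed, and tests that every line is mapped onto a line.

```python
from itertools import combinations

# The seven points of PG(2, 2) as nonzero binary triples.
points = [(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0),
          (1, 0, 1), (1, 1, 0), (1, 1, 1)]


def add(u, v):
    """Coordinatewise sum over GF(2)."""
    return tuple((a + b) % 2 for a, b in zip(u, v))


# Each line is {u, v, u + v}; the set removes repeated descriptions.
lines = {frozenset((u, v, add(u, v))) for u, v in combinations(points, 2)}

# The permutation ((011), (010), (001)) ((110), (111), (100)).
cycles = [[(0, 1, 1), (0, 1, 0), (0, 0, 1)],
          [(1, 1, 0), (1, 1, 1), (1, 0, 0)]]
perm = {p: p for p in points}  # points outside the cycles stay fixed
for cycle in cycles:
    for i, p in enumerate(cycle):
        perm[p] = cycle[(i + 1) % len(cycle)]


def is_collineation(perm, lines):
    """True if the image of every line is again a line."""
    return all(frozenset(perm[p] for p in line) in lines for line in lines)
```

Under the stated labelling, `len(lines)` is 7 and `is_collineation(perm, lines)` is `True`.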

The set of all collineations of PG(m, q) forms its collineation group. 
Suppose q = p^s where p is a prime. 

Recall from Theorem 12 of Ch. 4 that the automorphism group of the field 
GF(p^s) is a cyclic group of order s generated by 

    \sigma_p : \beta \to \beta^{p}, \quad \beta \in GF(p^s).

Clearly σ_p is a collineation of PG(m, p^s). 
Let C be an invertible (m + 1) × (m + 1) matrix over GF(p^s). Then the 
permutation of the points of PG(m, p^s) given by 

    (a_0, \ldots, a_m) \to (a_0, \ldots, a_m) C


is also a collineation. Clearly C and λC, λ ≠ 0, are the same collineation. 
Together σ_p and the matrices C generate a group consisting of the 
permutations 

    (a_0, \ldots, a_m) \to (a_0^{p^i}, \ldots, a_m^{p^i}) C, \quad 0 \le i < s.    (3)

There are s \prod_{i=0}^{m} (q^{m+1} - q^i) such permutations, but only 

    \frac{s}{q-1} \prod_{i=0}^{m} (q^{m+1} - q^i)    (4)

distinct collineations. This group of collineations is denoted by PΓL_{m+1}(q), 
q = p^s. 
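
As a small numerical illustration of (4), the following sketch evaluates the number of distinct collineations for given m, p and s. The function name `order_PGammaL` is ours, chosen for this example only.

```python
from math import prod


def order_PGammaL(m, p, s):
    """Number of distinct collineations of PG(m, q), q = p**s, as in (4)."""
    q = p**s
    # s field automorphisms times the invertible (m+1) x (m+1) matrices ...
    permutations = s * prod(q**(m + 1) - q**i for i in range(m + 1))
    # ... with the q - 1 nonzero scalar multiples of C identified.
    return permutations // (q - 1)
```

For PG(2, 2) this gives `order_PGammaL(2, 2, 1) == 168`, and for PG(2, 4) it gives `order_PGammaL(2, 2, 2) == 120960`.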


Theorem 7. (The fundamental theorem of projective geometry.) PΓL_{m+1}(q) is 
the full collineation group of PG(m, q). 


For the proof see for example Artin [29, p. 88], Baer [56, Ch. 3] or 
Carmichael [250, p. 360]. 
Since PΓL_{m+1}(q) is doubly transitive we have: 


Corollary 8. There is essentially only one way of obtaining EG(m, q) from 
PG(m, q). 


Corollary 9. The full collineation group of EG(m, q) is the subgroup of 
PΓL_{m+1}(q) which fixes the hyperplane at infinity (setwise), and has order 

    s \prod_{i=1}^{m} (q^{m+1} - q^i).    (5)

(See for example Carmichael [250, p. 374]). 
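
The order in Corollary 9 can be cross-checked against (4). The sketch below takes (5) to read s ∏_{i=1}^{m} (q^{m+1} - q^i), which is an assumption about the displayed formula, and uses function names of our own. Since the collineation group permutes the (q^{m+1} - 1)/(q - 1) hyperplanes of PG(m, q) transitively, the subgroup fixing one hyperplane should have order equal to (4) divided by that number.

```python
from math import prod


def order_affine_collineations(m, p, s):
    """Order of the collineation group of EG(m, q), q = p**s,
    assuming (5) is s * prod_{i=1}^{m} (q^(m+1) - q^i)."""
    q = p**s
    return s * prod(q**(m + 1) - q**i for i in range(1, m + 1))


def order_by_stabilizer(m, p, s):
    """Order from (4) divided by the number of hyperplanes of PG(m, q)."""
    q = p**s
    full = s * prod(q**(m + 1) - q**i for i in range(m + 1)) // (q - 1)
    hyperplanes = (q**(m + 1) - 1) // (q - 1)
    return full // hyperplanes
```

For EG(2, 2) both functions return 24, which is every permutation of its four points, and for EG(2, 3) both return 432.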


Problem. (13) Given EG(m, q) show that there is essentially only one way to 
add a hyperplane and obtain PG(m, q). 


§4. Projective and affine planes 


A projective geometry of dimension 2 is a projective plane. Unlike the 
situation in higher dimensions, a projective plane need not be a PG(2, q) for 
any q. 

In §5 of Ch. 2 we defined a projective plane to be a Steiner system 
S(2, n + 1, n² + n + 1) for some n ≥ 2, or in other words (Definition 2): a 
collection of n² + n + 1 points and n² + n + 1 lines, with n + 1 points on each 
line and a unique line containing any two points. 

However, the best definition of a projective plane is this. Definition 3. A 
projective plane is a collection of points and lines satisfying (i) there is a 
unique line containing any two points, (ii) any two distinct lines meet at a 
unique point, and (iii) there exist four points no three of which lie on a line. 


Theorem 10. The three definitions of a projective plane are equivalent. 


Sketch of Proof. Definition 1 (§2) ⇒ Definition 3. It is only necessary to show 
that any two lines meet. This follows because otherwise the two lines would 
contain four independent points and the dimension would not be 2. 

Definition 3 ⇒ Definition 2. Take two points p, q and a line L not 
containing them. Then the number of lines through p (or through q) is equal to the 
number of points on L. Call this number n + 1. Then the total number of 
points (or lines) is n(n + 1) + 1 = n² + n + 1. 

Definition 2 ⇒ Definition 1. To prove (iii) we show that any two lines meet. 
This follows from evaluating in two ways the sum of χ(p, L, M) over all 
points p and distinct lines L, M, where χ(p, L, M) = 1 if p = L ∩ M, = 0 
otherwise. The dimension is 2, for if p, q, r, s are independent points then the 
lines (pq) and (rs) do not meet. Q.E.D. 


A Steiner system S(2, n + 1, n² + n + 1) is called a projective plane of order 
n. Thus Figs. 2, 3 show projective planes of orders 2 and 3. In general a 
PG(2, q) is a projective plane of order q. From Theorem 7 of Ch. 4, this gives 
Desarguesian projective planes of all prime power orders. 
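
For small prime q the statement that PG(2, q) is a projective plane of order q can be verified by brute force. The following sketch is restricted to prime q, so that arithmetic modulo q is field arithmetic; prime powers would need a genuine GF(q) implementation. The function names are our own. It builds the points and lines of PG(2, q) from coordinates and tests conditions (i)-(iii) of Definition 3.

```python
from itertools import combinations, product


def projective_plane(q):
    """Points and lines of PG(2, q) for a prime q."""
    # One representative per point: first nonzero coordinate equal to 1.
    points = [v for v in product(range(q), repeat=3)
              if any(v) and next(x for x in v if x) == 1]
    # The line [u] consists of the points v with u . v = 0 (mod q).
    lines = [frozenset(v for v in points
                       if sum(a * b for a, b in zip(u, v)) % q == 0)
             for u in points]
    return points, lines


def is_projective_plane(points, lines):
    """Check conditions (i), (ii), (iii) of Definition 3."""
    # (i) a unique line contains any two points
    cond1 = all(sum(1 for L in lines if p in L and r in L) == 1
                for p, r in combinations(points, 2))
    # (ii) any two distinct lines meet at a unique point
    cond2 = all(len(L & M) == 1 for L, M in combinations(lines, 2))

    # (iii) there exist four points, no three of which lie on a line
    def no_three_on_a_line(quad):
        return not any(set(t) <= L
                       for t in combinations(quad, 3) for L in lines)

    cond3 = any(no_three_on_a_line(quad) for quad in combinations(points, 4))
    return cond1 and cond2 and cond3
```

With `points, lines = projective_plane(3)` one finds 13 points, 13 lines, 4 points on each line, and `is_projective_plane(points, lines)` is `True`, in agreement with Definition 2 for n = 3.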

However, not all projective planes are Desarguesian. In fact 
non-Desarguesian planes are known of all orders n = p^e > 8 where p is a prime and 
e > 1. For n ≤ 8 we have 


Theorem 11. The projective planes of orders n = 2,3,4,5,7,8 are unique (and 
are the Desarguesian planes PG(2, n)). 


For the proof see Problem 13 of Ch. 20, and the references on p. 144 of 
Dembowski [370]. 

We know from Problem 11 of Ch. 19 that there is no projective plane of 
order 6. This is a special case of 


Theorem 12. (Bruck and Ryser.) If n ≡ 1 or 2 (mod 4) and if n is not the sum 
of two squares then there is no projective plane of order n. 


For the proof see Hall [587, p. 175] or Hughes and Piper [674, p. 87]. 

Thus planes are known of orders 2, 3, 4, 5, 7, 8, 9, 11, 13, 16, 17, 19, ..., p^e, ...; 
orders 6, 14, 21, ... do not exist by Theorem 12, and orders 10, 12, 15, 18, 20, ... are 
undecided. For the connection between codes and orders n ≡ 2 (mod 4) see 
Problem 11 of Ch. 19. 
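
The condition of Theorem 12 is easy to test by machine. The sketch below uses helper names of our own and allows a square to be 0² (so that, for example, 9 = 3² + 0² counts as a sum of two squares). It lists the orders that the theorem excludes.

```python
from math import isqrt


def is_sum_of_two_squares(n):
    """True if n = a*a + b*b for some integers a, b >= 0."""
    return any(isqrt(n - a * a) ** 2 == n - a * a
               for a in range(isqrt(n) + 1))


def excluded_by_bruck_ryser(n):
    """True if Theorem 12 rules out a projective plane of order n."""
    return n % 4 in (1, 2) and not is_sum_of_two_squares(n)


excluded = [n for n in range(2, 31) if excluded_by_bruck_ryser(n)]
```

Here `excluded` is `[6, 14, 21, 22, 30]`. An order that passes this test is not thereby shown to exist: 10 = 3² + 1² passes, so the theorem says nothing about order 10.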


Affine or euclidean planes. An affine geometry of dimension 2 is an affine plane, 
and is obtained by deleting the points of a fixed line from a projective plane. 
A second definition was given in §5 of Ch. 2: an affine plane is an S(2, n, n²), 
n ≥ 2. A third definition is this. An affine plane is a collection of points and 
lines satisfying (i) there is a unique line containing any two points, (ii) given 
any line L and any point p ∉ L there is a unique line through p which does not 
meet L, and (iii) there exist three points not on a line. Again the three 
definitions agree, and we call an S(2, n, n²) an affine plane of order n. Then the 
results given above about the possible orders of projective planes apply also 
to affine planes. 
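
The second and third definitions can be compared on a concrete example. The sketch below constructs EG(2, q) for a prime q from the usual coordinate description: lines y = mx + b together with the vertical lines x = c. This construction and the function names are our own choices for illustration. It then tests the Steiner system property and the parallel condition (ii).

```python
from itertools import combinations


def affine_plane(q):
    """Points and lines of EG(2, q) for a prime q."""
    points = [(x, y) for x in range(q) for y in range(q)]
    # Lines y = m*x + b, one for each slope m and intercept b ...
    lines = [frozenset((x, (m * x + b) % q) for x in range(q))
             for m in range(q) for b in range(q)]
    # ... and the vertical lines x = c.
    lines += [frozenset((c, y) for y in range(q)) for c in range(q)]
    return points, lines


def is_steiner_2_n_n2(points, lines, n):
    """S(2, n, n^2): n^2 points, blocks of size n, each pair in one block."""
    return (len(points) == n * n
            and all(len(L) == n for L in lines)
            and all(sum(1 for L in lines if p in L and r in L) == 1
                    for p, r in combinations(points, 2)))


def satisfies_parallel_condition(points, lines):
    """Condition (ii): a unique line through p missing L, for p not on L."""
    return all(sum(1 for M in lines if p in M and not (M & L)) == 1
               for L in lines for p in points if p not in L)
```

For q = 3 this yields 9 points and 12 lines, and both checks return `True`.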


Notes on Appendix B 

Projective geometries are discussed by Artin [29], Baer [56], Biggs [143], 
Birkhoff [152, Ch. 8], Carmichael [250], Dembowski [370], Hall [583, Ch. 12], 
MacNeish [869], Segre [1173], and Veblen and Young [1368]. References on 
projective planes are Albert and Sandler [20], Hall [582, Ch. 20 and 587, Ch. 
12], Segre [1173] and especially Dembowski [370] and Hughes and Piper [674]. 
For the numbers of subspaces see for example Carmichael [250] or Goldman 
and Rota [519]. See also the series of papers by Dai, Feng, Wan and Yang 
[103, 104, 1441, 1457-1460, 1474]. 
relation, 652 
ring, 189, 609 

centralizer, 672 
RM, see code, Reed-Muller 
root of unity, 196 
RS, see code, Reed-Solomon 
run, 264, 406, 514 


scrambling, 406, 431 
sequence, de Bruijn, 431 
m-, 408 
PN, 406ff., 431 
pseudo-random, 406ff., 431 
sequential code reduction, 394 
series, Molien, 600 
sextet, 642 
shift register, 89, 407 
synthesis, 275, 369 
shift-and-add, 410 
shorten, 29, 592 
signal set, 31 
sphere, 10, 41 
-packing, viii, 633 
stabilizer, 637 
standard array, 16 
Steiner system, 59, 528, 634, 641, 692 
Stirling’s formula, 309 
subspace, 697 
summary, of alternant codes, 335 
of Ch. 1, 34 
of Chien-Choy code, 360 
of Goethals code, 477 
of Goppa code, 339 
of Hamming code, 25 
of Kerdock code, 456 
of Pless symmetry code, 511 
of Preparata code, 471 
of QR code, 482, 495 
of RM code, 376 
of RS code, 303 
of Srivastava code, 357 
support, 177 
symbol error rate, 20 
syndrome, 16ff., 213, 270ff., 365 
systematic, 302 
syzygy, 611 


table, BCH codes, 204, 267 
best codes, 674ff. 
best constant weight codes, 684ff. 
bounds, 564, 674ff. 
cosets of RM code, 418 
cyclotomic cosets, 105 
Delsarte-Goethals codes, 462 
double circulant codes, 509 
factors of x^n + 1, 198 
finite fields, 110ff., 124 
5-designs, 512 
GF(16), 85 
GF(9), 95 
idempotents, 221ff. 
intersection numbers, 68 
irreducible polynomials, 112, 124 
linear codes, 674 
minimal polynomials, 108ff. 
optimal linear codes, 556 
primitive polynomials, 112 
QR codes, 483 
RM codes, 376 
self-dual codes, 626 
split weight enumerators, 623 
tactical configuration, 58 
ternary, 7 
tetrad, 642 
theorem, Assmus-Mattson, 177, 187 
Bürmann-Lagrange, 627 
central limit, 287 
Chinese remainder, 305 
Christoffel-Darboux, 153, 559 
complementary slackness, 537 
Dedekind, 114 
Delsarte-Goethals, 667 
Dickson, 438 
duality, 537 
Fermat, 96 
fundamental, of projective geometry, 700 
Gleason, 602ff., 617ff. 
Gleason-Pierce-Turyn, 597 
Gleason-Prange, 492 
Hadamard, 51 
Kasami-Tokura, 446 
Levenshtein, 49ff., 531 
Lloyd, 176, 179 
Lucas, 404 
MacWilliams, 126ff., 144ff. 
MacWilliams-Mann, 385, 404 
McEliece, 447 
Molien, 600, 612 
Noether, 611 
normal basis, 122 
Parseval, 416 
Perron, 519 
Shannon, 22 
Singer, 398 
Tietäväinen-Van Lint, 180 
Venturini-Patel, 552 
trace, 116, 208, 573 
transform, 136 
discrete Fourier, 53, 239 
fast Fourier, 421ff. 
fast Hadamard, 422, 432 
Hadamard, 53, 127, 414, 419ff. 
MacWilliams, 137 
truth table, 371 
two-weight codes, 228, 445 


uniqueness, 634, 641, 645ff., 648, 702 
unitial, 528 


valency, 652 


weighing design, 52 
weight, 8 
from MS polynomial, 249ff. 
Hamming, 8 
minimum, 10 
real, 396 
weight distribution or enumerator, 40, 126, 135, 251, 255 
bi-, 148 
complete, 141, 597, 617 
exact, 147 
extremal, 624 
Hamming, 146 
joint, 147 
Lee, 145 
of coset, 132, 166ff. 
of Delsarte-Goethals code, 477 
of dual BCH code, 451ff., 669 
of dual code, 127, 144ff. 
of 1st order RM code, 445 
of Golay codes, 67, 69, 598 
of Hamming code, 129 
of Kerdock code, 456, 460 
of MDS code, 319 
of minimal code, 227 
of Preparata code, 460, 468 
of RM code, 445ff. 
of RM cosets, 22, 415 
of 2nd order RM code, 434ff. 
split, 149 
window property, 410 


zero of code, 199 